SYNOPSIS
bufrextract.pl <bufr file(s)>
[--ahl <ahl_regexp>]
[--only_ahl | --without_ahl | --gts]
[--filter <metadata criteria>]
[--outfile <filename>]
[--help]
[--verbose n]
DESCRIPTION
Extract all BUFR messages and/or corresponding AHLs from BUFR file(s), possibly filtering on AHL and/or metadata in section 1.
The AHL (Abbreviated Header Line) is recognized as the TTAAii CCCC YYGGgg [BBB] immediately preceding the BUFR message.
Run without arguments to display usage, or with option --help for some additional information. See also https://wiki.met.no/bufr.pm/start for examples of use.
OPTIONS
    --ahl <ahl_regexp>  Extract BUFR messages and/or AHLs with AHL
                        matching <ahl_regexp> only
    --gts               Include full GTS message envelope if present
    --only_ahl          Extract AHLs only
    --without_ahl       Extract BUFR messages only
    --filter <metadata criteria>
                        Extract BUFR messages matching the
                        <metadata criteria> only
    --outfile <filename>
                        Print to <filename> instead of STDOUT
    --help              Display Usage and explain the options used. For even
                        more info you might prefer to consult
                        perldoc bufrextract.pl
    --verbose n         Set verbose level to n, 0<=n<=6 (default 0)
Options may be abbreviated, e.g. --h or -h for --help.
For option --ahl, <ahl_regexp> should be a Perl regular expression. For example, --ahl 'ISS... ENMI' will extract only SHIP BUFR (ISS) from CCCC=ENMI.
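To see what such a pattern selects, here is a minimal sketch that applies the example regexp to a few made-up AHLs. The sample AHLs are hypothetical, and the unanchored match is an assumption about how the script applies <ahl_regexp>, not a copy of its code.

```perl
use strict;
use warnings;

# Hypothetical sample AHLs (TTAAii CCCC YYGGgg [BBB]); not real bulletins.
my @ahls = (
    'ISSL01 ENMI 121200',
    'ISSL01 ENMI 121200 RRA',
    'ISMD01 ENMI 121200',
    'ISSL01 EGRR 121200',
);

# The pattern from the example; each '.' matches any single character.
my $ahl_regexp = 'ISS... ENMI';

# Assumption: the pattern is applied unanchored, so it may match anywhere
# within the AHL.
my @matching = grep { /$ahl_regexp/ } @ahls;

# Prints the first two sample AHLs only.
print "$_\n" for @matching;
```

The third sample fails on the TTAAii part (ISM instead of ISS) and the fourth on CCCC. Remember to quote the pattern on the command line so that the shell leaves the space and dots alone.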
Use option --gts if you want the full GTS message envelope (if present) to be included in output. There are 2 main variations on this envelope (SOH/ETX and ZCZC notation), for details see the Manual on the GTS: Attachment II-4. Format of Meteorological Messages.
Using --filter makes it possible to filter on almost any of the metadata present in section 1 (and section 0) of the BUFR messages. A few examples should be enough to show how to write the <metadata criteria>: according to Common Code Table C-13 of WMO-No. 306, "dc=0 ic=0,1,2,6" should select synoptic and one-hour observations from fixed-land stations, while "dc=1 ic=0,6" should do the same for marine stations. To extract both, use "dc=0 ic=0,1,2,6 | dc=1 ic=0,6" as <metadata criteria>.
Here is the full list of metadata available for filtering (the first 2-letter abbreviation is what should be used in the <metadata criteria>):
be = BUFR edition
oc = Originating centre
os = Originating subcentre
dc = Data category (table A)
ic = International data subcategory
lc = Local data subcategory
mt = Master table version number
lt = Local table version number
ye = Year
mo = Month
da = Day
ho = Hour
mi = Minute
se = Second
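The way the criteria in the examples above read can be sketched in a few lines of Perl. This is an illustration only, not the script's own is_filtered(). It assumes that '|' separates alternatives, that all conditions within one alternative must hold, and that a comma-separated list means "any of these values". The subroutine name matches_criteria and the sample metadata are hypothetical.

```perl
use strict;
use warnings;

# Returns true if the metadata (hash ref keyed by the two-letter
# abbreviations) satisfies the criteria string.
sub matches_criteria {
    my ($criteria, $meta) = @_;
    ALTERNATIVE:
    for my $alternative (split /\|/, $criteria) {
        for my $condition (split ' ', $alternative) {
            my ($key, $values) = split /=/, $condition, 2;
            my $actual = $meta->{$key};
            # Give up on this alternative as soon as one condition fails.
            next ALTERNATIVE unless defined $actual
                && grep { $_ == $actual } split /,/, $values;
        }
        return 1;    # every condition in this alternative held
    }
    return 0;        # no alternative matched
}

my $criteria = 'dc=0 ic=0,1,2,6 | dc=1 ic=0,6';

my %land   = (dc => 0, ic => 2);    # matches the first alternative
my %marine = (dc => 1, ic => 0);    # matches the second alternative
my %other  = (dc => 1, ic => 2);    # ic=2 is only accepted together with dc=0

for my $meta (\%land, \%marine, \%other) {
    print matches_criteria($criteria, $meta) ? "keep\n" : "skip\n";
}
```

Under these assumptions the loop prints keep, keep, skip.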
Note that no BUFR tables are needed to run bufrextract.pl, since section 4 of the BUFR messages is not decoded (which also speeds up execution considerably).
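The reason no tables are needed is that the metadata listed above sit at fixed octet positions in sections 0 and 1. The following sketch shows this for BUFR edition 4 only (earlier editions lay out section 1 differently). It is not how BUFR.pm does it; the subroutine name section1_metadata and all values in the synthetic header are hypothetical.

```perl
use strict;
use warnings;

# Read the section 0/1 metadata of an edition 4 BUFR message.
sub section1_metadata {
    my ($msg) = @_;
    # Section 0: 'BUFR', 3-octet total length, 1-octet edition number.
    my ($magic, $edition) = unpack 'a4 x3 C', $msg;
    die "Not a BUFR message\n" unless $magic eq 'BUFR';
    die "This sketch handles edition 4 only\n" unless $edition == 4;

    my %meta = (be => $edition);
    # Skip section 0 (8 octets), then within section 1 skip the length (3),
    # master table (1), and later the update sequence number and flag.
    @meta{qw(oc os dc ic lc mt lt ye mo da ho mi se)} =
        unpack 'x8 x3 x n n x x C C C C C n C C C C C', $msg;
    return \%meta;
}

# Synthetic header with made-up values; not a complete BUFR message.
my $header = pack 'a4 C3 C', 'BUFR', 0, 0, 30, 4;
$header .= pack 'C3 C n n C C C C C C C n C C C C C',
    0, 0, 22,    # length of section 1
    0,           # master table
    88, 0,       # originating centre, subcentre
    0, 0,        # update sequence number, optional section flag
    0, 2, 0,     # data category, international and local subcategory
    29, 0,       # master and local table version
    2024, 1, 15, 12, 0, 0;    # year, month, day, hour, minute, second

my $meta = section1_metadata($header);
printf "%s=%d\n", $_, $meta->{$_} for qw(be oc dc ic ye mo da ho);
```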
HINTS
With a little knowledge of Perl you could easily extend bufrextract.pl to extract BUFR messages based on whatever information is available in sections 0-3, by making your own copy of bufrextract.pl and then employing one of the many get_ subroutines in BUFR.pm. For example, to extract only BUFR messages with TM315009, add the following line just before the call to is_filtered() in the code:
next if $bufr->get_descriptors_unexpanded() ne '315009';
CAVEAT
Sometimes GTS bulletins are erroneously issued with extra characters between the GTS AHL and the start of the BUFR message (besides the standard character sequence CRCRLF), which will likely cause bufrextract.pl to miss the AHL.
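The effect can be illustrated with a small sketch. The pattern below is a hypothetical way of recognizing an AHL immediately preceding the BUFR message; the actual matching in bufrextract.pl may well differ. The sample data are made up.

```perl
use strict;
use warnings;

# Hypothetical pattern: an AHL (TTAAii CCCC YYGGgg [BBB]) followed by exactly
# CR CR LF and then the start of the BUFR message.
my $ahl_before_bufr =
    qr/([A-Z]{4}\d{2} [A-Z]{4} \d{6}(?: [A-Z]{3})?)\r\r\nBUFR/;

# Returns the AHL, or undef when none immediately precedes 'BUFR'.
sub find_ahl {
    my ($data) = @_;
    return $data =~ $ahl_before_bufr ? $1 : undef;
}

my $well_formed = "ISSL01 ENMI 121200\r\r\nBUFR";
my $malformed   = "ISSL01 ENMI 121200\r\r\n\r\nBUFR";    # stray CR LF

for my $sample ($well_formed, $malformed) {
    my $ahl = find_ahl($sample);
    print defined($ahl) ? "AHL found: $ahl\n" : "AHL missed\n";
}
```

With this pattern the AHL is found in the first sample but missed in the second, where two stray characters separate it from the BUFR message.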
AUTHOR
Pål Sannes <pal.sannes@met.no>
COPYRIGHT
Copyright (C) 2010-2026 MET Norway