DOC HOME SITE MAP MAN PAGES GNU INFO SEARCH
 

binary(n)




______________________________________________________________________________


NAME

       binary - Insert and extract fields from binary strings


SYNOPSIS

       binary format formatString ?arg arg ...?
       binary scan string formatString ?varName varName ...?
_________________________________________________________________


DESCRIPTION

       This  command  provides  facilities  for manipulating binary data.  The
       first form, binary format, creates a binary string from normal Tcl val-
       ues.   For  example,  given the values 16 and 22, on a 32 bit architec-
       ture, it might produce an 8-byte binary string consisting of two 4-byte
       integers, one for each of the numbers.  The second form of the command,
       binary scan, does the opposite: it extracts data from a  binary  string
       and returns it as ordinary Tcl string values.


BINARY FORMAT

       The  binary  format  command  generates a binary string whose layout is
       specified by the formatString and whose contents come  from  the  addi-
       tional arguments.  The resulting binary value is returned.

       The  formatString  consists  of a sequence of zero or more field speci-
       fiers separated by zero or more spaces.  Each field specifier is a sin-
       gle  type  character followed by an optional numeric count.  Most field
       specifiers consume one argument to obtain the value  to  be  formatted.
       The  type  character  specifies  how the value is to be formatted.  The
       count typically indicates how many items  of  the  specified  type  are
       taken  from the value.  If present, the count is a non-negative decimal
       integer or *, which normally indicates that all of  the  items  in  the
       value  are  to  be used.  If the number of arguments does not match the
       number of fields in the format string that consume arguments,  then  an
       error is generated.

       Here is a small example to clarify the relation between the field spec-
       ifiers and the arguments:
              binary format d3d {1.0 2.0 3.0 4.0} 0.1

       The first argument is a list of four numbers, but because of the  count
       of  3  for the associated field specifier, only the first three will be
       used. The second argument is associated with the  second  field  speci-
       fier.  The  resulting binary string contains the four numbers 1.0, 2.0,
       3.0 and 0.1.

       Each type-count pair moves an imaginary cursor through the binary data,
       storing  bytes at the current position and advancing the cursor to just
       after the last byte stored.  The cursor is initially at position  0  at
       the  beginning  of  the data.  The type may be any one of the following
       characters:

       a    Stores a character string of length count in  the  output  string.
            Every character is taken as modulo 256 (i.e. the low byte of every
            character is used, and the high byte discarded)  so  when  storing
            character  strings  not  wholly  expressible  using the characters
            \u0000-\u00ff, the encoding convertto command should be used first
            if  this truncation is not desired (i.e. if the characters are not
            part of the ISO 8859-1 character set.)   If  arg  has  fewer  than
            count  bytes,  then  additional zero bytes are used to pad out the
            field.  If arg is longer than  the  specified  length,  the  extra
            characters  will be ignored.  If count is *, then all of the bytes
            in arg will be formatted.  If count is omitted, then one character
            will be formatted.  For example,
                   binary format a7a*a alpha bravo charlie
            will return a string equivalent to alpha\000\000bravoc.

       A    This form is the same as a except that spaces are used for padding
            instead of nulls.  For example,
                   binary format A6A*A alpha bravo charlie
            will return alpha bravoc.

       b    Stores a string of count binary digits in low-to-high order within
            each  byte in the output string.  Arg must contain a sequence of 1
            and 0 characters.  The resulting bytes are  emitted  in  first  to
            last  order  with  the  bits  being formatted in low-to-high order
            within each byte.  If arg has fewer than count digits, then  zeros
            will  be  used  for  the remaining bits.  If arg has more than the
            specified number of digits, the extra digits will be ignored.   If
            count  is  *, then all of the digits in arg will be formatted.  If
            count is omitted, then one digit will be formatted.  If the number
            of  bits  formatted does not end at a byte boundary, the remaining
            bits of the last byte will be zeros.  For example,
                   binary format b5b* 11100 111000011010
            will return a string equivalent to \x07\x87\x05.

       B    This form is the same as b except that  the  bits  are  stored  in
            high-to-low order within each byte.  For example,
                   binary format B5B* 11100 111000011010
            will return a string equivalent to \xe0\xe1\xa0.

       h    Stores  a string of count hexadecimal digits in low-to-high within
            each byte in the output string.  Arg must contain  a  sequence  of
            characters  in  the set ``0123456789abcdefABCDEF''.  The resulting
            bytes are emitted in first to last order with the hex digits being
            formatted in low-to-high order within each byte.  If arg has fewer
            than count digits, then zeros will be used for the remaining  dig-
            its.   If  arg  has  more than the specified number of digits, the
            extra digits will be ignored.  If count is *, then all of the dig-
            its in arg will be formatted.  If count is omitted, then one digit
            will be formatted.  If the number of digits formatted does not end
            at  a  byte  boundary, the remaining bits of the last byte will be
            zeros.  For example,
                   binary format h3h* AB def
            will return a string equivalent to \xba\x00\xed\x0f.

       H    This form is the same as h except that the digits  are  stored  in
            high-to-low order within each byte.  For example,
                   binary format H3H* ab DEF
            will return a string equivalent to \xab\x00\xde\xf0.

       c    Stores  one or more 8-bit integer values in the output string.  If
            no count is specified, then arg must consist of an integer  value;
            otherwise  arg  must  consist  of a list containing at least count
            integer elements.  The low-order 8 bits of each integer are stored
            as  a  one-byte value at the cursor position.  If count is *, then
            all of the integers in the list are formatted.  If the  number  of
            elements  in the list is fewer than count, then an error is gener-
            ated.  If the number of elements  in  the  list  is  greater  than
            count, then the extra elements are ignored.  For example,
                   binary format c3cc* {3 -3 128 1} 260 {2 5}
            will  return  a  string  equivalent  to  \x03\xfd\x80\x04\x02\x05,
            whereas
                   binary format c {2 5}
            will generate an error.

       s    This form is the same as c except  that  it  stores  one  or  more
            16-bit  integers in little-endian byte order in the output string.
            The low-order 16-bits of each integer are  stored  as  a  two-byte
            value  at  the  cursor  position  with  the least significant byte
            stored first.  For example,
                   binary format s3 {3 -3 258 1}
            will return a string equivalent to \x03\x00\xfd\xff\x02\x01.

       S    This form is the same as s except  that  it  stores  one  or  more
            16-bit  integers  in  big-endian  byte order in the output string.
            For example,
                   binary format S3 {3 -3 258 1}
            will return a string equivalent to \x00\x03\xff\xfd\x01\x02.

       i    This form is the same as c except  that  it  stores  one  or  more
            32-bit  integers in little-endian byte order in the output string.
            The low-order 32-bits of each integer are stored  as  a  four-byte
            value  at  the  cursor  position  with  the least significant byte
            stored first.  For example,
                   binary format i3 {3 -3 65536 1}
            will       return       a       string        equivalent        to
            \x03\x00\x00\x00\xfd\xff\xff\xff\x00\x00\x01\x00

       I    This  form  is the same as i except that it stores one or more one
            or more 32-bit integers in big-endian byte  order  in  the  output
            string.  For example,
                   binary format I3 {3 -3 65536 1}
            will        return        a       string       equivalent       to
            \x00\x00\x00\x03\xff\xff\xff\xfd\x00\x01\x00\x00

       w    This form is the same as c except  that  it  stores  one  or  more |
            64-bit  integers in little-endian byte order in the output string. |
            The low-order 64-bits of each integer are stored as an  eight-byte |
            value  at  the  cursor  position  with  the least significant byte |
            stored first.  For example,                                        |
                   binary format w 7810179016327718216                         |
            will return the string HelloTcl                                    |

       W                                                                       ||
            This  form  is the same as w except that it stores one or more one |
            or more 64-bit integers in big-endian byte  order  in  the  output |
            string.  For example,                                              |
                   binary format Wc 4785469626960341345 110                    |
            will return the string BigEndian

       f    This  form  is the same as c except that it stores one or more one
            or more single-precision floating in the machine's  native  repre-
            sentation in the output string.  This representation is not porta-
            ble across architectures, so it should not be used to  communicate
            floating point numbers across the network.  The size of a floating
            point number may vary across architectures, so the number of bytes
            that are generated may vary.  If the value overflows the machine's
            native representation, then the value of FLT_MAX as defined by the
            system  will  be  used instead.  Because Tcl uses double-precision
            floating-point numbers internally, there may be some loss of  pre-
            cision  in  the conversion to single-precision.  For example, on a
            Windows system running on an Intel Pentium processor,
                   binary format f2 {1.6 3.4}
            will       return       a       string        equivalent        to
            \xcd\xcc\xcc\x3f\x9a\x99\x59\x40.

       d    This  form  is the same as f except that it stores one or more one
            or more double-precision floating in the machine's  native  repre-
            sentation  in the output string.  For example, on a Windows system
            running on an Intel Pentium processor,
                   binary format d1 {1.6}
            will       return       a       string        equivalent        to
            \x9a\x99\x99\x99\x99\x99\xf9\x3f.

       x    Stores  count  null  bytes  in the output string.  If count is not
            specified, stores one null byte.  If  count  is  *,  generates  an
            error.  This type does not consume an argument.  For example,
                   binary format a3xa3x2a3 abc def ghi
            will return a string equivalent to abc\000def\000\000ghi.

       X    Moves  the cursor back count bytes in the output string.  If count
            is * or is larger than the current cursor position, then the  cur-
            sor  is positioned at location 0 so that the next byte stored will
            be the first byte in the result string.  If count is omitted  then
            the  cursor is moved back one byte.  This type does not consume an
            argument.  For example,
                   binary format a3X*a3X2a3 abc def ghi
            will return dghi.

       @    Moves the cursor to the absolute location  in  the  output  string
            specified  by  count.   Position 0 refers to the first byte in the
            output string.  If count refers to a position beyond the last byte
            stored so far, then null bytes will be placed in the uninitialized
            locations and the cursor will be placed at the specified location.
            If  count is *, then the cursor is moved to the current end of the
            output string.  If count is omitted, then an error will be  gener-
            ated.  This type does not consume an argument. For example,
                   binary format a5@2a1@*a3@10a1 abcde f ghi j
            will return abfdeghi\000\000j.


BINARY SCAN

       The  binary  scan command parses fields from a binary string, returning
       the number of conversions performed.  String  gives  the  input  to  be
       parsed  and formatString indicates how to parse it.  Each varName gives
       the name of a variable; when a field is scanned from string the  result
       is assigned to the corresponding variable.

       As  with binary format, the formatString consists of a sequence of zero
       or more field specifiers separated by zero or more spaces.  Each  field
       specifier  is  a  single type character followed by an optional numeric
       count.  Most field specifiers consume one argument to obtain the  vari-
       able  into which the scanned values should be placed.  The type charac-
       ter specifies how the binary data is to be interpreted.  The count typ-
       ically  indicates  how  many items of the specified type are taken from
       the data.  If present, the count is a non-negative decimal  integer  or
       *, which normally indicates that all of the remaining items in the data
       are to be used.  If there are not enough bytes left after  the  current
       cursor position to satisfy the current field specifier, then the corre-
       sponding variable is left untouched and binary scan returns immediately
       with  the  number  of variables that were set.  If there are not enough
       arguments for all of the fields in the format string that consume argu-
       ments, then an error is generated.

       A  similar  example  as  with binary format should explain the relation
       between field specifiers and arguments in case of the binary scan  sub-
       command:
              binary scan $bytes s3s first second

       This  command (provided the binary string in the variable bytes is long
       enough) assigns a list of three integers  to  the  variable  first  and
       assigns a single value to the variable second.  If bytes contains fewer
       than 8 bytes (i.e. four 2-byte integers), no assignment to second  will
       be  made,  and  if bytes contains fewer than 6 bytes (i.e. three 2-byte
       integers), no assignment to first will be made.  Hence:
              puts [binary scan abcdefg s3s first second]
              puts $first
              puts $second
       will print (assuming neither variable is set previously):
              1
              25185 25699 26213
              can't read "second": no such variable

       It is important to note that the c, s, and S (and i and I on 64bit sys-
       tems)  will be scanned into long data size values.  In doing this, val-
       ues that have their high bit set (0x80 for chars,  0x8000  for  shorts,
       0x80000000  for  ints), will be sign extended.  Thus the following will
       occur:
              set signShort [binary format s1 0x8000]
              binary scan $signShort s1 val; # val == 0xFFFF8000
       If you want to produce an unsigned value, then you can mask the  return
       value  to  the desired size.  For example, to produce an unsigned short
       value:
              set val [expr {$val & 0xFFFF}]; # val == 0x8000

       Each type-count pair moves an imaginary cursor through the binary data,
       reading  bytes  from  the current position.  The cursor is initially at
       position 0 at the beginning of the data.  The type may be  any  one  of
       the following characters:

       a    The  data  is  a character string of length count.  If count is *,
            then all of the remaining bytes in string will be scanned into the
            variable.   If  count  is  omitted,  then  one  character  will be
            scanned.  All characters scanned will be interpreted as  being  in
            the  range \u0000-\u00ff so the encoding convertfrom command might
            be needed if the string is not an ISO 8859-1 string.  For example,
                   binary scan abcde\000fghi a6a10 var1 var2
            will  return  1  with the string equivalent to abcde\000 stored in
            var1 and var2 left unmodified.

       A    This form is the same as a, except trailing blanks and  nulls  are
            stripped  from  the scanned value before it is stored in the vari-
            able.  For example,
                   binary scan "abc efghi  \000" A* var1
            will return 1 with abc efghi stored in var1.

       b    The data is turned into a string of count binary digits in low-to-
            high  order  represented  as a sequence of ``1'' and ``0'' charac-
            ters.  The data bytes are scanned in first to last order with  the
            bits being taken in low-to-high order within each byte.  Any extra
            bits in the last byte are ignored.  If count is *, then all of the
            remaining  bits  in  string will be scanned.  If count is omitted,
            then one bit will be scanned.  For example,
                   binary scan \x07\x87\x05 b5b* var1 var2
            will return 2 with  11100  stored  in  var1  and  1110000110100000
            stored in var2.

       B    This  form is the same as b, except the bits are taken in high-to-
            low order within each byte.  For example,
                   binary scan \x70\x87\x05 B5B* var1 var2
            will return 2 with  01110  stored  in  var1  and  1000011100000101
            stored in var2.

       h    The  data  is  turned into a string of count hexadecimal digits in
            low-to-high order represented as a sequence of characters  in  the
            set  ``0123456789abcdef''.  The data bytes are scanned in first to
            last order with the hex digits being taken  in  low-to-high  order
            within  each  byte.   Any extra bits in the last byte are ignored.
            If count is *, then all of the remaining hex digits in string will
            be  scanned.   If  count  is  omitted,  then one hex digit will be
            scanned.  For example,
                   binary scan \x07\x86\x05 h3h* var1 var2
            will return 2 with 706 stored in var1 and 50 stored in var2.

       H    This form is the same as h, except the digits are taken  in  high-
            to-low order within each byte.  For example,
                   binary scan \x07\x86\x05 H3H* var1 var2
            will return 2 with 078 stored in var1 and 05 stored in var2.

       c    The  data is turned into count 8-bit signed integers and stored in
            the corresponding variable as a list. If count is *, then  all  of
            the  remaining bytes in string will be scanned.  If count is omit-
            ted, then one 8-bit integer will be scanned.  For example,
                   binary scan \x07\x86\x05 c2c* var1 var2
            will return 2 with 7 -122 stored in var1 and  5  stored  in  var2.
            Note  that  the integers returned are signed, but they can be con-
            verted to unsigned 8-bit quantities using an expression like:
                   expr { $num & 0xff }

       s    The data is interpreted as count  16-bit  signed  integers  repre-
            sented  in  little-endian  byte order.  The integers are stored in
            the corresponding variable as a list.  If count is *, then all  of
            the  remaining bytes in string will be scanned.  If count is omit-
            ted, then one 16-bit integer will be scanned.  For example,
                   binary scan \x05\x00\x07\x00\xf0\xff s2s* var1 var2
            will return 2 with 5 7 stored in var1  and  -16  stored  in  var2.
            Note  that  the integers returned are signed, but they can be con-
            verted to unsigned 16-bit quantities using an expression like:
                   expr { $num & 0xffff }

       S    This form is the same as s except that the data is interpreted  as
            count 16-bit signed integers represented in big-endian byte order.
            For example,
                   binary scan \x00\x05\x00\x07\xff\xf0 S2S* var1 var2
            will return 2 with 5 7 stored in var1 and -16 stored in var2.

       i    The data is interpreted as count  32-bit  signed  integers  repre-
            sented  in  little-endian  byte order.  The integers are stored in
            the corresponding variable as a list.  If count is *, then all  of
            the  remaining bytes in string will be scanned.  If count is omit-
            ted, then one 32-bit integer will be scanned.  For example,
                   binary scan \x05\x00\x00\x00\x07\x00\x00\x00\xf0\xff\xff\xff i2i* var1 var2
            will return 2 with 5 7 stored in var1  and  -16  stored  in  var2.
            Note  that  the integers returned are signed, but they can be con-
            verted to unsigned 32-bit quantities using an expression like:
                   expr { $num & 0xffffffff }

       I    This form is the same as I except that the data is interpreted  as
            count 32-bit signed integers represented in big-endian byte order.
            For example,
                   binary scan \x00\x00\x00\x05\x00\x00\x00\x07\xff\xff\xff\xf0 I2I* var1 var2
            will return 2 with 5 7 stored in var1 and -16 stored in var2.

       w    The data is interpreted as count  64-bit  signed  integers  repre- |
            sented  in  little-endian  byte order.  The integers are stored in |
            the corresponding variable as a list.  If count is *, then all  of |
            the  remaining bytes in string will be scanned.  If count is omit- |
            ted, then one 64-bit integer will be scanned.  For example,        |
                   binary scan \x05\x00\x00\x00\x07\x00\x00\x00\xf0\xff\xff\xff wi* var1 var2|
            will return 2 with 30064771077 stored in var1 and  -16  stored  in |
            var2.   Note  that  the integers returned are signed and cannot be |
            represented by Tcl as unsigned values.                             |

       W                                                                       ||
            This  form is the same as w except that the data is interpreted as |
            count 64-bit signed integers represented in big-endian byte order. |
            For example,                                                       |
                   binary scan \x00\x00\x00\x05\x00\x00\x00\x07\xff\xff\xff\xf0 WI* var1 var2|
            will  return  2  with 21474836487 stored in var1 and -16 stored in |
            var2.

       f    The data is interpreted as count single-precision  floating  point
            numbers  in  the  machine's  native  representation.  The floating
            point numbers are stored in the corresponding variable as a  list.
            If  count  is *, then all of the remaining bytes in string will be
            scanned.  If count is omitted, then one single-precision  floating
            point number will be scanned.  The size of a floating point number
            may vary across architectures, so the number  of  bytes  that  are
            scanned may vary.  If the data does not represent a valid floating
            point number, the resulting value is undefined and compiler depen-
            dent.   For  example, on a Windows system running on an Intel Pen-
            tium processor,
                   binary scan \x3f\xcc\xcc\xcd f var1
            will return 1 with 1.6000000238418579 stored in var1.

       d    This form is the same as f except that the data is interpreted  as
            count  double-precision  floating  point  numbers in the machine's
            native representation. For example, on a Windows system running on
            an Intel Pentium processor,
                   binary scan \x9a\x99\x99\x99\x99\x99\xf9\x3f d var1
            will return 1 with 1.6000000000000001 stored in var1.

       x    Moves  the cursor forward count bytes in string.  If count is * or
            is larger than the number of bytes after the current cursor cursor
            position,  then  the  cursor  is positioned after the last byte in
            string.  If count is omitted, then the cursor is moved forward one
            byte.   Note  that  this  type  does not consume an argument.  For
            example,
                   binary scan \x01\x02\x03\x04 x2H* var1
            will return 1 with 0304 stored in var1.

       X    Moves the cursor back count bytes in string.  If count is * or  is
            larger  than the current cursor position, then the cursor is posi-
            tioned at location 0 so that the next byte  scanned  will  be  the
            first  byte  in  string.   If  count is omitted then the cursor is
            moved back one byte.  Note that this  type  does  not  consume  an
            argument.  For example,
                   binary scan \x01\x02\x03\x04 c2XH* var1 var2
            will return 2 with 1 2 stored in var1 and 020304 stored in var2.

       @    Moves the cursor to the absolute location in the data string spec-
            ified by count.  Note that position 0 refers to the first byte  in
            string.   If  count refers to a position beyond the end of string,
            then the cursor is positioned after the last byte.   If  count  is
            omitted, then an error will be generated.  For example,
                   binary scan \x01\x02\x03\x04 c2@1H* var1 var2
            will return 2 with 1 2 stored in var1 and 020304 stored in var2.


PLATFORM ISSUES

       Sometimes  it  is  desirable  to  format  or scan integer values in the
       native byte order for the machine.  Refer to the byteOrder  element  of
       the  tcl_platform array to decide which type character to use when for-
       matting or scanning integers.


EXAMPLES

       This is a procedure to write a Tcl string to a  binary-encoded  channel
       as UTF-8 data preceded by a length word:
              proc writeString {channel string} {
                  set data [encoding convertto utf-8 $string]
                  puts -nonewline [binary format Ia* \
                          [string length $data] $data]
              }

       This  procedure  reads  a string from a channel that was written by the
       previously presented writeString procedure:
              proc readString {channel} {
                  if {![binary scan [read $channel 4] I length]} {
                      error "missing length"
                  }
                  set data [read $channel $length]
                  return [encoding convertfrom utf-8 $data]
              }


SEE ALSO

       format(n), scan(n), tclvars(n)


KEYWORDS

       binary, format, scan

Tcl                                   8.0                            binary(n)

Man(1) output converted with man2html