Eio.Buf_read
Buffered input and parsing.
This module provides fairly efficient non-backtracking parsers. It is modelled on Angstrom's API, and you should use that if backtracking is needed.
Example:
let r = Buf_read.of_flow flow ~max_size:1_000_000 in
Buf_read.line r
An input buffer.
Buffer_limit_exceeded is raised if parsing an item would require enlarging the buffer beyond its configured limit (max_size).
An 'a parser is a function that consumes input from a buffer and returns a value of type 'a.
val parse :
?initial_size:int ->
max_size:int ->
'a parser ->
_ Flow.source ->
('a, [> `Msg of string ]) result
parse p flow ~max_size uses p to parse everything in flow.
It is a convenience function that does
let buf = of_flow flow ~max_size in
format_errors (p <* end_of_input) buf
parse_exn wraps parse, but raises Failure msg if that returns Error (`Msg msg).
Catching exceptions with parse and then re-raising them might seem pointless, but it turns, for example, an End_of_file exception into a Failure with a more user-friendly message.
parse_string p s uses p to parse everything in s. It is defined as format_errors (p <* end_of_input) (of_string s).
parse_string_exn is like parse_string, but handles errors like parse_exn.
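To illustrate parse_string and parse_string_exn, the sketch below runs the line parser over in-memory strings. The alias R and the helper parse_one_line are names chosen for this example, not part of the module, and the code assumes the eio library is linked.

```ocaml
(* Sketch only: assumes the eio library is available. R is a local alias. *)
module R = Eio.Buf_read

(* Hypothetical helper: succeeds only if [s] holds exactly one line,
   because parse_string also requires end_of_input after the parser. *)
let parse_one_line s = R.parse_string R.line s

let () =
  (* The whole input is a single line, so parsing succeeds. *)
  assert (parse_one_line "hello\n" = Ok "hello");
  (* Data remains after the first line, so the result is an Error. *)
  assert (Result.is_error (parse_one_line "hello\nworld\n"));
  (* parse_string_exn returns the value directly, or raises Failure. *)
  assert (R.parse_string_exn R.line "hello\n" = "hello")
```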
of_flow ~max_size flow is a buffered reader backed by flow.
of_buffer buf is a reader that reads from buf. buf is used directly, without being copied. eof_seen (of_buffer buf) = true. This module will not modify buf itself, but it will expose it via peek.
as_flow t is a buffered flow.
Reading from it will return data from the buffer, only reading the underlying flow if the buffer is empty.
line parses one line.
Lines can be terminated by either LF or CRLF. The returned string does not include the terminator.
If End_of_file is reached after seeing some data but before seeing a line terminator, the data seen is returned as the last line.
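As a small illustration of these rules, the following sketch reads three lines from an in-memory buffer with mixed terminators. The alias R and the binding names are chosen for this example only.

```ocaml
(* Sketch only: assumes the eio library is available. *)
module R = Eio.Buf_read

(* Read three lines; the last one has no terminator at all. *)
let first, second, third =
  let r = R.of_string "one\r\ntwo\nthree" in
  let a = R.line r in   (* terminated by CRLF *)
  let b = R.line r in   (* terminated by LF *)
  let c = R.line r in   (* ended by end-of-file after some data *)
  (a, b, c)
```

A further call to R.line on the same buffer would raise End_of_file, since no data is left.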
lines returns a sequence that lazily reads the next line until the end of the input is reached.
lines = seq line ~stop:at_end_of_input
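For instance, lines can be combined with List.of_seq to collect every line of an input. This is a sketch; all_lines is a hypothetical helper, and forcing the whole sequence like this is only sensible for inputs that fit in memory.

```ocaml
(* Sketch only: assumes the eio library is available. *)
module R = Eio.Buf_read

(* Hypothetical helper: collect every line of a string into a list.
   The sequence is consumed immediately, so each node is used only once. *)
let all_lines s = R.of_string s |> R.lines |> List.of_seq
```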
peek_char returns Some c where c is the next character, but does not consume it.
Returns None at the end of the input stream rather than raising End_of_file.
string s checks that s is the next string in the stream and consumes it.
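The sketch below shows one way peek_char and string might be used together: look at the next character without consuming it, then consume a fixed prefix. The helper after_get and the "GET " prefix are illustrative assumptions.

```ocaml
(* Sketch only: assumes the eio library is available. *)
module R = Eio.Buf_read

(* Hypothetical helper: if the input starts with 'G', consume the literal
   "GET " and return whatever follows; otherwise consume nothing. *)
let after_get s =
  let r = R.of_string s in
  match R.peek_char r with
  | Some 'G' ->
    R.string "GET " r;          (* consumes the literal, or fails on mismatch *)
    Some (R.take_all r)
  | Some _ | None -> None       (* peek_char itself never consumes input *)
```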
take_all takes all remaining data until end-of-file.
Returns "" if already at end-of-file.
take_while p finds the first byte for which p is false and consumes and returns all bytes before that.
If p is true for all remaining bytes, it returns everything until end-of-file.
It will return the empty string if there are no matching characters (and therefore never raises End_of_file).
take_while1 p is like take_while p, except that the parser fails with "take_while1" if it consumes no input.
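A minimal sketch of the difference between the two parsers follows. The helpers is_digit, split_digits and digits1 are hypothetical, and the code assumes that a failing parser raises Failure, as the description of format_errors suggests.

```ocaml
(* Sketch only: assumes the eio library is available. *)
module R = Eio.Buf_read

let is_digit = function '0' .. '9' -> true | _ -> false

(* Split a string into its leading digits and the remainder.
   take_while may return "" and does not raise End_of_file. *)
let split_digits s =
  let r = R.of_string s in
  let digits = R.take_while is_digit r in
  (digits, R.take_all r)

(* Like the first half of split_digits, but require at least one digit.
   Assumption: failure is reported as a Failure exception. *)
let digits1 s =
  match R.take_while1 is_digit (R.of_string s) with
  | d -> Some d
  | exception Failure _ -> None
```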
skip_while p skips zero or more bytes for which p is true.
skip_while p t does the same thing as ignore (take_while p t), except that it is not limited by the buffer size.
skip_while1 p is like skip_while p, except that the parser fails with "skip_while1" if it skips no input.
skip n discards the next n bytes.
skip n = map ignore (take n), except that the number of skipped bytes may be larger than the buffer (the buffer will not grow).
Note: if End_of_file is raised, all bytes in the stream will have been consumed.
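As an illustration, skip_while and skip can be chained to discard a variable-length and then a fixed-length prefix. The helper payload and its input layout (leading spaces, a 2-byte tag, then data) are assumptions made for this sketch.

```ocaml
(* Sketch only: assumes the eio library is available. *)
module R = Eio.Buf_read

(* Hypothetical helper: drop leading spaces, then a fixed 2-byte tag,
   and return the rest of the input. Raises End_of_file if fewer than
   2 bytes remain after the spaces. *)
let payload s =
  let r = R.of_string s in
  R.skip_while (fun c -> c = ' ') r;
  R.skip 2 r;
  R.take_all r
```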
at_end_of_input returns true when at the end of the stream, or false if there is at least one more byte to be read.
end_of_input checks that there are no further bytes in the stream.
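The two functions differ in how they report the result: at_end_of_input returns a boolean, while end_of_input is a check that fails if data remains. A short sketch, with hypothetical helper names:

```ocaml
(* Sketch only: assumes the eio library is available. *)
module R = Eio.Buf_read

(* Hypothetical helper: consume one byte, then report whether the
   input is exhausted. *)
let one_byte_then_eof s =
  let r = R.of_string s in
  let c = R.take 1 r in
  (c, R.at_end_of_input r)

(* end_of_input returns unit at the end of the input and fails otherwise. *)
let check_empty s = R.end_of_input (R.of_string s)
```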
seq p is a sequence that uses p to get the next item.
A sequence node can only be used while the stream is at the expected position, and will raise Invalid_argument if any bytes have been consumed in the meantime. This also means that each node can only be used once; use Seq.memoize to make the sequence persistent.
It is not necessary to consume all the elements of the sequence.
Example (head 4 is a parser that takes 4 lines):
let head n r =
r |> Buf_read.(seq line) |> Seq.take n |> List.of_seq
pair a b is a parser that first uses a to parse a value x, then uses b to parse a value y, then returns (x, y).
Note that this module does not support backtracking, so if b fails then the bytes consumed by a are lost.
return x is a parser that consumes nothing and always returns x. return is just Fun.const.
map f a is a parser that parses the stream with a to get v, and then returns f v.
bind a f is a parser that first uses a to parse a value v, then uses f v to select the next parser, and then uses that.
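The combinators above can be composed into larger parsers. The sketch below builds two small ones using only pair, map, bind, string, line, take and take_while1; the names key_value and length_prefixed and the input formats they accept are invented for illustration.

```ocaml
(* Sketch only: assumes the eio library is available. *)
module R = Eio.Buf_read

let is_key_char = function 'a' .. 'z' | '_' -> true | _ -> false

(* Parses "key=value\n" into (key, value). The second component first
   consumes the "=" and then reads the rest of the line. *)
let key_value : (string * string) R.parser =
  R.pair
    (R.take_while1 is_key_char)
    (R.bind (R.string "=") (fun () -> R.line))

(* Parses a decimal length on its own line, then exactly that many bytes.
   bind lets the first result (the length) choose the next parser. *)
let length_prefixed : string R.parser =
  R.bind (R.map int_of_string R.line) (fun n -> R.take n)
```

Because there is no backtracking, a failure in the second half of either parser leaves the bytes read by the first half consumed.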
format_errors p catches Failure, End_of_file and Buffer_limit_exceeded exceptions and returns them as a formatted error message.
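Unlike parse_string, format_errors on its own does not require the parser to reach the end of the input. A minimal sketch, where safe_line is a hypothetical helper:

```ocaml
(* Sketch only: assumes the eio library is available. *)
module R = Eio.Buf_read

(* Hypothetical helper: run the line parser and turn the exceptions
   listed above into an Error (`Msg _) result instead of letting them escape. *)
let safe_line s = R.format_errors R.line (R.of_string s)
```

With this helper, an empty input yields an Error rather than an End_of_file exception, and input left over after the first line is not treated as an error.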
buffered_bytes t is the number of bytes that can be read without reading from the underlying flow.
peek t returns a view onto the active part of t's internal buffer.
Performing any operation that might add to the buffer may invalidate this, so it should be used immediately and then forgotten.
Cstruct.length (peek t) = buffered_bytes t.
ensure t n ensures that the buffer contains at least n bytes of data.
If not, it reads from the flow until it does.
After ensure t n returns, buffered_bytes t >= n.
consumed_bytes t is the total number of bytes consumed.
i.e. it is the offset into the stream of the next byte to be parsed.
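The low-level accessors can be observed together on an in-memory buffer. In this sketch, inspect is a hypothetical helper; it assumes the cstruct library (used by peek's return type) is available alongside eio, and that n does not exceed the length of s.

```ocaml
(* Sketch only: assumes the eio and cstruct libraries are available. *)
module R = Eio.Buf_read

(* Hypothetical helper. Returns:
   (bytes buffered at the start, bytes buffered after skipping n,
    bytes consumed after skipping n, the remaining buffer contents). *)
let inspect s n =
  let r = R.of_string s in
  let before = R.buffered_bytes r in
  R.ensure r n;                  (* a no-op here: the data is already buffered *)
  R.skip n r;
  let after = R.buffered_bytes r in
  let consumed = R.consumed_bytes r in
  (* Copy the view at once; later operations may invalidate it. *)
  let rest = Cstruct.to_string (R.peek r) in
  (before, after, consumed, rest)
```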
eof_seen t indicates whether we've received End_of_file from the underlying flow.
If so, there will never be any further data beyond what peek already returns.
Note that this returns false if we're at the end of the stream but don't know it yet. Use at_end_of_input to be sure.
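The distinction between eof_seen and at_end_of_input can be sketched with a flow-backed reader. This example assumes Eio.Flow.string_source can be read outside an event loop and that it delivers its two bytes in a single read; eof_flags is a hypothetical helper.

```ocaml
(* Sketch only: assumes the eio library is available. *)
module R = Eio.Buf_read

(* Hypothetical helper. Returns:
   (eof_seen after consuming all data, at_end_of_input, eof_seen afterwards). *)
let eof_flags () =
  let r = R.of_flow ~max_size:100 (Eio.Flow.string_source "hi") in
  let _ = R.take 2 r in                 (* consume everything the flow holds *)
  let before = R.eof_seen r in          (* the end may not have been observed yet *)
  let at_end = R.at_end_of_input r in   (* attempts a read to find out *)
  let after = R.eof_seen r in           (* now the end is known *)
  (before, at_end, after)
```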