== \IO Streams

Ruby supports processing data as \IO streams;
that is, as data that may be read, re-read, written, re-written,
and traversed via iteration.

Core classes with such support include:

- IO, and its derived class File.
- {StringIO}[rdoc-ref:StringIO]: for processing a string.
- {ARGF}[rdoc-ref:ARGF]: for processing files cited on the command line.

Pre-existing stream objects that are referenced by global variables or constants include:

- $stdin: read-only instance of \IO.
- $stdout: write-only instance of \IO.
- $stderr: write-only instance of \IO.
- \ARGF: read-only instance of \ARGF.

You can create stream objects:

- \File:

  - File.new: returns a new \File object.
  - File.open: passes a new \File object to the given block.

- \IO:

  - IO.new: returns a new \IO object for the given integer file descriptor.
  - IO.open: passes a new \IO object to the given block.
  - IO.popen: returns a new \IO object that is connected to the $stdin
    and $stdout of a newly-launched subprocess.
  - Kernel#open: returns a new \IO object connected to a given source:
    stream, file, or subprocess.

- \StringIO:

  - StringIO.new: returns a new \StringIO object.
  - StringIO.open: passes a new \StringIO object to the given block.

(You cannot create an \ARGF object, but one already exists.)

=== About the Examples

Many examples here use these variables:

  # English text with newlines.
  text = <<~EOT
    First line
    Second line

    Fourth line
    Fifth line
  EOT

  # Russian text.
  russian = "\u{442 435 441 442}" # => "тест"

  # Binary data.
  data = "\u9990\u9991\u9992\u9993\u9994"

  # Text file.
  File.write('t.txt', text)

  # File with Russian text.
  File.write('t.rus', russian)

  # File with binary data.
  f = File.new('t.dat', 'wb:UTF-16')
  f.write(data)
  f.close

=== Position

An \IO stream has a nonnegative integer _position_,
which is the byte offset at which the next read or write is to occur;
the relevant methods:

- +#tell+ (aliased as +#pos+): Returns the current position (in bytes) in the stream:

    f = File.new('t.txt')
    f.tell # => 0
    f.gets # => "First line\n"
    f.tell # => 11
    f.close

- +#pos=+: Sets the position of the stream (in bytes):

    f = File.new('t.txt')
    f.tell     # => 0
    f.pos = 20 # => 20
    f.tell     # => 20
    f.close

- +#seek+: Sets the position of the stream to a given integer +offset+
  (in bytes), with respect to a given constant +whence+, which is one of:

  - +:CUR+ or IO::SEEK_CUR:
    Repositions the stream to its current position plus the given +offset+:

      f = File.new('t.txt')
      f.tell            # => 0
      f.seek(20, :CUR)  # => 0
      f.tell            # => 20
      f.seek(-10, :CUR) # => 0
      f.tell            # => 10
      f.close

  - +:END+ or IO::SEEK_END:
    Repositions the stream to its end plus the given +offset+:

      f = File.new('t.txt')
      f.tell            # => 0
      f.seek(0, :END)   # => 0  # Repositions to stream end.
      f.tell            # => 47
      f.seek(-20, :END) # => 0
      f.tell            # => 27
      f.seek(-40, :END) # => 0
      f.tell            # => 7
      f.close

  - +:SET+ or IO::SEEK_SET:
    Repositions the stream to the given +offset+:

      f = File.new('t.txt')
      f.tell            # => 0
      f.seek(20, :SET)  # => 0
      f.tell            # => 20
      f.seek(40, :SET)  # => 0
      f.tell            # => 40
      f.close

- +#rewind+: Positions the stream to the beginning:

    f = File.new('t.txt')
    f.tell   # => 0
    f.gets   # => "First line\n"
    f.tell   # => 11
    f.rewind # => 0
    f.tell   # => 0
    f.close

=== Lines

Some reader methods in \IO streams are line-oriented;
such a method reads one or more lines,
which are separated by an implicit or explicit line separator.
These methods are included (except as noted) in classes Kernel, IO, File,
and {ARGF}[rdoc-ref:ARGF]:

- +#each_line+ - passes each line to the block; not in Kernel:

    f = File.new('t.txt')
    f.each_line {|line| p line }

  Output:

    "First line\n"
    "Second line\n"
    "\n"
    "Fourth line\n"
    "Fifth line\n"

  The reading may begin mid-line:

    f = File.new('t.txt')
    f.pos = 27
    f.each_line {|line| p line }

  Output:

    "rth line\n"
    "Fifth line\n"

- +#gets+ - returns the next line (which may begin mid-line):

    f = File.new('t.txt')
    f.gets      # => "First line\n"
    f.gets      # => "Second line\n"
    f.pos = 27
    f.gets      # => "rth line\n"
    f.readlines # => ["Fifth line\n"]
    f.gets      # => nil

- +#readline+ - like #gets, but raises an exception at end-of-file;
  not in StringIO.

- +#readlines+ - returns all remaining lines in an array;
  may begin mid-line:

    f = File.new('t.txt')
    f.pos = 19
    f.readlines # => ["ine\n", "\n", "Fourth line\n", "Fifth line\n"]
    f.readlines # => []

Each of these methods may be called with:

- An optional line separator, +sep+.
- An optional line-size limit, +limit+.
- Both +sep+ and +limit+.

==== Line Separator

The default line separator is given by the global variable $/,
whose value is by default "\n".
The line to be read next is all data from the current position
to the next line separator:

  f = File.new('t.txt')
  f.gets # => "First line\n"
  f.gets # => "Second line\n"
  f.gets # => "\n"
  f.gets # => "Fourth line\n"
  f.gets # => "Fifth line\n"
  f.close

You can specify a different line separator:

  f = File.new('t.txt')
  f.gets('l')   # => "First l"
  f.gets('li')  # => "ine\nSecond li"
  f.gets('lin') # => "ne\n\nFourth lin"
  f.gets        # => "e\n"
  f.close

There are two special line separators:

- +nil+: The entire stream is read into a single string:

    f = File.new('t.txt')
    f.gets(nil) # => "First line\nSecond line\n\nFourth line\nFifth line\n"
    f.close

- '' (the empty string): The next "paragraph" is read
  (paragraphs being separated by two consecutive line separators):

    f = File.new('t.txt')
    f.gets('') # => "First line\nSecond line\n\n"
    f.gets('') # => "Fourth line\nFifth line\n"
    f.close

==== Line Limit

The line to be read may be further defined by an optional integer argument +limit+,
which specifies that the number of bytes returned may not be (much) longer
than the given +limit+;
a multi-byte character will not be split, and so a line may be slightly longer
than the given limit.

If +limit+ is not given, the line is determined only by +sep+.

  # Text with 1-byte characters.
  File.open('t.txt') {|f| f.gets(1) }  # => "F"
  File.open('t.txt') {|f| f.gets(2) }  # => "Fi"
  File.open('t.txt') {|f| f.gets(3) }  # => "Fir"
  File.open('t.txt') {|f| f.gets(4) }  # => "Firs"
  # No more than one line.
  File.open('t.txt') {|f| f.gets(10) } # => "First line"
  File.open('t.txt') {|f| f.gets(11) } # => "First line\n"
  File.open('t.txt') {|f| f.gets(12) } # => "First line\n"

  # Text with 2-byte characters, which will not be split.
  File.open('t.rus') {|f| f.gets(1).size } # => 1
  File.open('t.rus') {|f| f.gets(2).size } # => 1
  File.open('t.rus') {|f| f.gets(3).size } # => 2
  File.open('t.rus') {|f| f.gets(4).size } # => 2

==== Line Separator and Line Limit

With arguments +sep+ and +limit+ given,
a line-oriented method combines the two behaviors:

- Returns the next line as determined by line separator +sep+.
- But returns no more bytes than are allowed by the limit.

Example:

  File.open('t.txt') {|f| f.gets('li', 20) } # => "First li"
  File.open('t.txt') {|f| f.gets('li', 2) }  # => "Fi"

==== Line Number

A readable \IO stream has a _line_ _number_,
which is the non-negative integer line number
in the stream where the next read will occur.

A new stream initially has line number +0+.

\Method IO#lineno returns the line number.

Reading lines from a stream usually changes its line number:

  f = File.new('t.txt', 'r')
  f.lineno   # => 0
  f.readline # => "First line\n"
  f.lineno   # => 1
  f.readline # => "Second line\n"
  f.lineno   # => 2
  f.readline # => "\n"
  f.lineno   # => 3
  f.eof?     # => false
  f.close

Iterating over lines in a stream usually changes its line number:

  f = File.new('t.txt')
  f.each_line do |line|
    p "position=#{f.pos} eof?=#{f.eof?} lineno=#{f.lineno}"
  end
  f.close

Output:

  "position=11 eof?=false lineno=1"
  "position=23 eof?=false lineno=2"
  "position=24 eof?=false lineno=3"
  "position=36 eof?=false lineno=4"
  "position=47 eof?=true lineno=5"

==== Line Options

A number of \IO methods accept optional keyword arguments
that determine how lines in a stream are to be treated:

- +:chomp+: If +true+, line separators are omitted; default is +false+.

=== Open and Closed \IO Streams

A new \IO stream may be open for reading, open for writing, or both.

You can close a stream using these methods:

- +#close+ - closes the stream for both reading and writing.
- +#close_read+ (not available in \ARGF) - closes the stream for reading.
- +#close_write+ (not available in \ARGF) - closes the stream for writing.
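
The closing methods listed above can be tried without touching the file system.
The following is a minimal sketch, assuming a \StringIO opened in read/write mode
(<tt>'r+'</tt>) behaves like any other read/write stream for this purpose;
the helper name +close_demo+ is hypothetical and not part of Ruby:

```ruby
require 'stringio'

# Hypothetical helper: closes each side of a read/write StringIO in turn
# and reports what was observed along the way.
def close_demo
  strio = StringIO.new(+"First line\n", 'r+')
  strio.close_read          # Closes the stream for reading only.
  read_error =
    begin
      strio.gets
      nil
    rescue IOError => e
      e.class               # Reading from a read-closed stream raises IOError.
    end
  strio.write('More')       # The stream is still open for writing.
  strio.close_write         # Now the stream is closed for writing as well.
  { read_error: read_error, closed: strio.closed? }
end

p close_demo
```

In this sketch the stream should count as fully closed only after both sides have been closed;
calling +#close+ instead would close both sides in one step.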
You can query whether a stream is closed using this method:

- +#closed?+ - returns whether the stream is closed.

=== Stream End-of-File

You can query whether a stream is at end-of-file using this method:

- +#eof?+ (also aliased as +#eof+) - returns whether the stream is at end-of-file.
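
As an illustration of end-of-file queries, here is a small sketch that reads a stream
line by line until +#eof?+ reports +true+.
It uses a \StringIO in place of the example file so that it is self-contained;
the helper name +read_until_eof+ is hypothetical:

```ruby
require 'stringio'

# Hypothetical helper: collects the remaining lines of a stream,
# checking #eof? before each read.
def read_until_eof(io)
  lines = []
  lines << io.gets until io.eof?
  lines
end

strio = StringIO.new("First line\nSecond line\n")
p read_until_eof(strio) # Prints ["First line\n", "Second line\n"].
p strio.eof?            # Prints true.
```

For ordinary use, +#each_line+ or +#readlines+ is usually simpler;
an explicit +#eof?+ check is mainly useful when reads are interleaved with other work.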