author Arnold D. Robbins <arnold@skeeve.com> 2023-04-07 14:40:03 +0300
committer Arnold D. Robbins <arnold@skeeve.com> 2023-04-07 14:40:03 +0300
commit 2003b18129d4eb24011f9b39eb35c79598daf546
tree 52ac01766fcf312bb49d189b1f35c244a0c61e9d
parent e63e393634006ca2e94f0d0715e486193d82ae66
Improve CSV record handling.
-rw-r--r--  ChangeLog           6
-rw-r--r--  doc/ChangeLog       6
-rw-r--r--  doc/gawk.info    1057
-rw-r--r--  doc/gawk.texi       6
-rw-r--r--  doc/gawktexi.in     3
-rw-r--r--  io.c               19
6 files changed, 556 insertions, 541 deletions
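
The ChangeLog entry in this commit describes the new rule for CSV input: a carriage return is no longer stripped everywhere, and it becomes part of RT only when it sits directly before the line feed. The C fragment below is a minimal sketch of that rule and is not the code in io.c. The names `rec_end` and `find_csv_record_end` are hypothetical, and the sketch assumes that a line feed inside double quotes does not end the record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result type: where the record ends and how long RT is. */
struct rec_end {
    size_t rec_len;  /* bytes in the record, excluding the terminator */
    size_t rt_len;   /* 0 = no terminator found, 1 = "\n", 2 = "\r\n" */
};

/*
 * Simplified sketch, not gawk's csvscan(): find the first line feed that
 * is outside double quotes. A carriage return directly before that line
 * feed is counted as part of the terminator; any other CR stays in the
 * record data.
 */
static struct rec_end
find_csv_record_end(const char *buf, size_t len)
{
    struct rec_end r = { len, 0 };
    bool in_quote = false;

    assert(buf != NULL);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '"') {
            /* A doubled "" toggles twice, so the state is unchanged. */
            in_quote = !in_quote;
        } else if (buf[i] == '\n' && !in_quote) {
            bool cr = (i > 0 && buf[i - 1] == '\r');
            r.rec_len = cr ? i - 1 : i;
            r.rt_len = cr ? 2 : 1;
            break;
        }
    }
    return r;
}
```

Under these assumptions, the input `a,b\r\nc` yields a three-byte record with a two-byte terminator, while a carriage return that is not followed by a line feed stays inside the record.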
diff --git a/ChangeLog b/ChangeLog
index b72bf1d3..9c5945a0 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2023-04-07 Andrew J. Schorr <aschorr@telemetry-investments.com>
+
+ * io.c (csvscan): Instead of stripping all carriage returns in the
+ input, simply include the CR in the RT if it occurs just before
+ the linefeed character.
+
2023-04-07 Arnold D. Robbins <arnold@skeeve.com>
* array.c (assoc_info): Update to handle additional cases so
diff --git a/doc/ChangeLog b/doc/ChangeLog
index 5225862a..13e2f082 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,9 @@
+2023-04-06 Andrew J. Schorr <aschorr@telemetry-investments.com>
+
+ * gawktexi.in (Carriage-Return--Line-Feed Line Endings In CSV Files):
+ Revise to explain that carriage-returns will be included in RT
+ when they appear just before a line-feed.
+
2023-04-04 Arnold D. Robbins <arnold@skeeve.com>
* gawktexi.in (Controlling Scanning): Fix the logic in the example.
diff --git a/doc/gawk.info b/doc/gawk.info
index 754378a5..e3bcc9e7 100644
--- a/doc/gawk.info
+++ b/doc/gawk.info
@@ -5332,7 +5332,8 @@ Records::).
Many CSV files are imported from systems where the line terminator
for text files is a carriage-return–line-feed pair (CR-LF, ‘\r’ followed
by ‘\n’). For ease of use, when processing CSV files, ‘gawk’ simply
-strips out any carriage-return characters in the input.
+includes the carriage-return character in the record terminator when it
+occurs immediately prior to a line-feed character in the input.
The behavior of the ‘split()’ function (not formally discussed yet,
see *note String Functions::) differs slightly when processing CSV
@@ -39663,533 +39664,533 @@ Node: Regexp Field Splitting243342
Node: Single Character Fields247171
Node: Comma Separated Fields248260
Ref: table-csv-examples249679
-Node: Command Line Field Separator251609
-Node: Full Line Fields254995
-Ref: Full Line Fields-Footnote-1256575
-Ref: Full Line Fields-Footnote-2256621
-Node: Field Splitting Summary256729
-Node: Constant Size259163
-Node: Fixed width data259907
-Node: Skipping intervening263426
-Node: Allowing trailing data264228
-Node: Fields with fixed data265293
-Node: Splitting By Content266919
-Ref: Splitting By Content-Footnote-1271188
-Node: More CSV271351
-Node: FS versus FPAT273004
-Node: Testing field creation274213
-Node: Multiple Line275991
-Node: Getline282473
-Node: Plain Getline285059
-Node: Getline/Variable287709
-Node: Getline/File288906
-Node: Getline/Variable/File290354
-Ref: Getline/Variable/File-Footnote-1291999
-Node: Getline/Pipe292095
-Node: Getline/Variable/Pipe294908
-Node: Getline/Coprocess296091
-Node: Getline/Variable/Coprocess297414
-Node: Getline Notes298180
-Node: Getline Summary301141
-Ref: table-getline-variants301585
-Node: Read Timeout302490
-Ref: Read Timeout-Footnote-1306454
-Node: Retrying Input306512
-Node: Command-line directories307779
-Node: Input Summary308717
-Node: Input Exercises312097
-Node: Printing312537
-Node: Print314480
-Node: Print Examples315986
-Node: Output Separators318839
-Node: OFMT320950
-Node: Printf322373
-Node: Basic Printf323178
-Node: Control Letters324814
-Node: Format Modifiers330283
-Node: Printf Examples336569
-Node: Redirection339114
-Node: Special FD346188
-Ref: Special FD-Footnote-1349478
-Node: Special Files349564
-Node: Other Inherited Files350193
-Node: Special Network351258
-Node: Special Caveats352146
-Node: Close Files And Pipes353129
-Ref: Close Files And Pipes-Footnote-1359265
-Node: Close Return Value359421
-Ref: table-close-pipe-return-values360696
-Ref: Close Return Value-Footnote-1361530
-Node: Noflush361686
-Node: Nonfatal363198
-Node: Output Summary365615
-Node: Output Exercises366901
-Node: Expressions367592
-Node: Values368794
-Node: Constants369472
-Node: Scalar Constants370169
-Ref: Scalar Constants-Footnote-1372747
-Ref: Scalar Constants-Footnote-2372997
-Node: Nondecimal-numbers373077
-Node: Regexp Constants376198
-Node: Using Constant Regexps376744
-Node: Standard Regexp Constants377390
-Node: Strong Regexp Constants380690
-Node: Variables384541
-Node: Using Variables385206
-Node: Assignment Options387186
-Node: Conversion389748
-Node: Strings And Numbers390280
-Ref: Strings And Numbers-Footnote-1393499
-Node: Locale influences conversions393608
-Ref: table-locale-affects396458
-Node: All Operators397101
-Node: Arithmetic Ops397742
-Node: Concatenation400572
-Ref: Concatenation-Footnote-1403522
-Node: Assignment Ops403645
-Ref: table-assign-ops408784
-Node: Increment Ops410166
-Node: Truth Values and Conditions413765
-Node: Truth Values414891
-Node: Typing and Comparison415982
-Node: Variable Typing416818
-Ref: Variable Typing-Footnote-1423480
-Ref: Variable Typing-Footnote-2423560
-Node: Comparison Operators423643
-Ref: table-relational-ops424070
-Node: POSIX String Comparison427756
-Ref: POSIX String Comparison-Footnote-1429515
-Ref: POSIX String Comparison-Footnote-2429658
-Node: Boolean Ops429742
-Ref: Boolean Ops-Footnote-1434435
-Node: Conditional Exp434531
-Node: Function Calls436317
-Node: Precedence440267
-Node: Locales444144
-Node: Expressions Summary445826
-Node: Patterns and Actions448489
-Node: Pattern Overview449631
-Node: Regexp Patterns451357
-Node: Expression Patterns451903
-Node: Ranges455812
-Node: BEGIN/END458990
-Node: Using BEGIN/END459803
-Ref: Using BEGIN/END-Footnote-1462713
-Node: I/O And BEGIN/END462823
-Node: BEGINFILE/ENDFILE465304
-Node: Empty468745
-Node: Using Shell Variables469062
-Node: Action Overview471400
-Node: Statements473835
-Node: If Statement475733
-Node: While Statement477302
-Node: Do Statement479390
-Node: For Statement480576
-Node: Switch Statement483933
-Node: Break Statement486484
-Node: Continue Statement488676
-Node: Next Statement490608
-Node: Nextfile Statement493105
-Node: Exit Statement495966
-Node: Built-in Variables498499
-Node: User-modified499678
-Node: Auto-set507889
-Ref: Auto-set-Footnote-1525988
-Ref: Auto-set-Footnote-2526206
-Node: ARGC and ARGV526262
-Node: Pattern Action Summary530701
-Node: Arrays533317
-Node: Array Basics534694
-Node: Array Intro535544
-Ref: figure-array-elements537560
-Ref: Array Intro-Footnote-1540429
-Node: Reference to Elements540561
-Node: Assigning Elements543083
-Node: Array Example543578
-Node: Scanning an Array545547
-Node: Controlling Scanning548644
-Ref: Controlling Scanning-Footnote-1555290
-Node: Numeric Array Subscripts555614
-Node: Uninitialized Subscripts557888
-Node: Delete559567
-Ref: Delete-Footnote-1562381
-Node: Multidimensional562438
-Node: Multiscanning565643
-Node: Arrays of Arrays567315
-Node: Arrays Summary572215
-Node: Functions574404
-Node: Built-in575464
-Node: Calling Built-in576653
-Node: Boolean Functions578700
-Node: Numeric Functions579270
-Ref: Numeric Functions-Footnote-1583463
-Ref: Numeric Functions-Footnote-2584147
-Ref: Numeric Functions-Footnote-3584199
-Node: String Functions584475
-Ref: String Functions-Footnote-1610706
-Ref: String Functions-Footnote-2610840
-Ref: String Functions-Footnote-3611100
-Node: Gory Details611187
-Ref: table-sub-escapes613094
-Ref: table-sub-proposed614740
-Ref: table-posix-sub616250
-Ref: table-gensub-escapes617938
-Ref: Gory Details-Footnote-1618872
-Node: I/O Functions619026
-Ref: table-system-return-values625713
-Ref: I/O Functions-Footnote-1627884
-Ref: I/O Functions-Footnote-2628032
-Node: Time Functions628152
-Ref: Time Functions-Footnote-1639308
-Ref: Time Functions-Footnote-2639384
-Ref: Time Functions-Footnote-3639546
-Ref: Time Functions-Footnote-4639657
-Ref: Time Functions-Footnote-5639775
-Ref: Time Functions-Footnote-6640010
-Node: Bitwise Functions640292
-Ref: table-bitwise-ops640894
-Ref: Bitwise Functions-Footnote-1647148
-Ref: Bitwise Functions-Footnote-2647327
-Node: Type Functions647524
-Node: I18N Functions651117
-Node: User-defined652860
-Node: Definition Syntax653680
-Ref: Definition Syntax-Footnote-1659508
-Node: Function Example659585
-Ref: Function Example-Footnote-1662564
-Node: Function Calling662586
-Node: Calling A Function663180
-Node: Variable Scope664150
-Node: Pass By Value/Reference667204
-Node: Function Caveats669936
-Ref: Function Caveats-Footnote-1672031
-Node: Return Statement672155
-Node: Dynamic Typing675210
-Node: Indirect Calls677602
-Node: Functions Summary688761
-Node: Library Functions691538
-Ref: Library Functions-Footnote-1695086
-Ref: Library Functions-Footnote-2695229
-Node: Library Names695404
-Ref: Library Names-Footnote-1699198
-Ref: Library Names-Footnote-2699425
-Node: General Functions699521
-Node: Strtonum Function700715
-Node: Assert Function703797
-Node: Round Function707249
-Node: Cliff Random Function708827
-Node: Ordinal Functions709860
-Ref: Ordinal Functions-Footnote-1712969
-Ref: Ordinal Functions-Footnote-2713221
-Node: Join Function713435
-Ref: Join Function-Footnote-1715238
-Node: Getlocaltime Function715442
-Node: Readfile Function719216
-Node: Shell Quoting721245
-Node: Isnumeric Function722701
-Node: Data File Management724113
-Node: Filetrans Function724745
-Node: Rewind Function729039
-Node: File Checking731018
-Ref: File Checking-Footnote-1732390
-Node: Empty Files732597
-Node: Ignoring Assigns734664
-Node: Getopt Function736238
-Ref: Getopt Function-Footnote-1752072
-Node: Passwd Functions752284
-Ref: Passwd Functions-Footnote-1761466
-Node: Group Functions761554
-Ref: Group Functions-Footnote-1769692
-Node: Walking Arrays769905
-Node: Library Functions Summary772953
-Node: Library Exercises774377
-Node: Sample Programs774864
-Node: Running Examples775646
-Node: Clones776398
-Node: Cut Program777670
-Node: Egrep Program788111
-Node: Id Program797428
-Node: Split Program807542
-Ref: Split Program-Footnote-1817777
-Node: Tee Program817964
-Node: Uniq Program820873
-Node: Wc Program828738
-Node: Bytes vs. Characters829133
-Node: Using extensions830735
-Node: wc program831515
-Node: Miscellaneous Programs836521
-Node: Dupword Program837750
-Node: Alarm Program839813
-Node: Translate Program844726
-Ref: Translate Program-Footnote-1849467
-Node: Labels Program849745
-Ref: Labels Program-Footnote-1853186
-Node: Word Sorting853278
-Node: History Sorting857472
-Node: Extract Program859747
-Node: Simple Sed868016
-Node: Igawk Program871232
-Ref: Igawk Program-Footnote-1886479
-Ref: Igawk Program-Footnote-2886685
-Ref: Igawk Program-Footnote-3886815
-Node: Anagram Program886942
-Node: Signature Program890038
-Node: Programs Summary891290
-Node: Programs Exercises892548
-Ref: Programs Exercises-Footnote-1896864
-Node: Advanced Features896950
-Node: Nondecimal Data899444
-Node: Boolean Typed Values901074
-Node: Array Sorting903049
-Node: Controlling Array Traversal903778
-Ref: Controlling Array Traversal-Footnote-1912285
-Node: Array Sorting Functions912407
-Ref: Array Sorting Functions-Footnote-1918526
-Node: Two-way I/O918734
-Ref: Two-way I/O-Footnote-1926729
-Ref: Two-way I/O-Footnote-2926920
-Node: TCP/IP Networking927002
-Node: Profiling930182
-Node: Persistent Memory939892
-Ref: Persistent Memory-Footnote-1948850
-Node: Extension Philosophy948981
-Node: Advanced Features Summary950516
-Node: Internationalization952786
-Node: I18N and L10N954492
-Node: Explaining gettext955187
-Ref: Explaining gettext-Footnote-1961340
-Ref: Explaining gettext-Footnote-2961535
-Node: Programmer i18n961700
-Ref: Programmer i18n-Footnote-1966813
-Node: Translator i18n966862
-Node: String Extraction967698
-Ref: String Extraction-Footnote-1968876
-Node: Printf Ordering968974
-Ref: Printf Ordering-Footnote-1971836
-Node: I18N Portability971904
-Ref: I18N Portability-Footnote-1974478
-Node: I18N Example974549
-Ref: I18N Example-Footnote-1977949
-Ref: I18N Example-Footnote-2978025
-Node: Gawk I18N978142
-Node: I18N Summary978798
-Node: Debugger980199
-Node: Debugging981223
-Node: Debugging Concepts981672
-Node: Debugging Terms983498
-Node: Awk Debugging986111
-Ref: Awk Debugging-Footnote-1987088
-Node: Sample Debugging Session987228
-Node: Debugger Invocation987780
-Node: Finding The Bug989409
-Node: List of Debugger Commands996095
-Node: Breakpoint Control997472
-Node: Debugger Execution Control1001304
-Node: Viewing And Changing Data1004784
-Node: Execution Stack1008522
-Node: Debugger Info1010203
-Node: Miscellaneous Debugger Commands1014502
-Node: Readline Support1019755
-Node: Limitations1020701
-Node: Debugging Summary1023345
-Node: Namespaces1024648
-Node: Global Namespace1025775
-Node: Qualified Names1027220
-Node: Default Namespace1028255
-Node: Changing The Namespace1029030
-Node: Naming Rules1030724
-Node: Internal Name Management1032639
-Node: Namespace Example1033709
-Node: Namespace And Features1036292
-Node: Namespace Summary1037749
-Node: Arbitrary Precision Arithmetic1039262
-Node: Computer Arithmetic1040781
-Ref: table-numeric-ranges1044598
-Ref: table-floating-point-ranges1045096
-Ref: Computer Arithmetic-Footnote-11045755
-Node: Math Definitions1045814
-Ref: table-ieee-formats1048859
-Node: MPFR features1049433
-Node: MPFR On Parole1049886
-Ref: MPFR On Parole-Footnote-11050730
-Node: MPFR Intro1050889
-Node: FP Math Caution1052579
-Ref: FP Math Caution-Footnote-11053653
-Node: Inexactness of computations1054030
-Node: Inexact representation1055061
-Node: Comparing FP Values1056444
-Node: Errors accumulate1057702
-Node: Strange values1059169
-Ref: Strange values-Footnote-11061835
-Node: Getting Accuracy1061940
-Node: Try To Round1064677
-Node: Setting precision1065584
-Ref: table-predefined-precision-strings1066289
-Node: Setting the rounding mode1068174
-Ref: table-gawk-rounding-modes1068556
-Ref: Setting the rounding mode-Footnote-11072614
-Node: Arbitrary Precision Integers1072797
-Ref: Arbitrary Precision Integers-Footnote-11076009
-Node: Checking for MPFR1076165
-Node: POSIX Floating Point Problems1077655
-Ref: POSIX Floating Point Problems-Footnote-11082519
-Node: Floating point summary1082557
-Node: Dynamic Extensions1084821
-Node: Extension Intro1086420
-Node: Plugin License1087728
-Node: Extension Mechanism Outline1088541
-Ref: figure-load-extension1088992
-Ref: figure-register-new-function1090577
-Ref: figure-call-new-function1091687
-Node: Extension API Description1093811
-Node: Extension API Functions Introduction1095540
-Ref: table-api-std-headers1097438
-Node: General Data Types1101902
-Ref: General Data Types-Footnote-11111070
-Node: Memory Allocation Functions1111385
-Ref: Memory Allocation Functions-Footnote-11116110
-Node: Constructor Functions1116209
-Node: API Ownership of MPFR and GMP Values1120114
-Node: Registration Functions1121675
-Node: Extension Functions1122379
-Node: Exit Callback Functions1127955
-Node: Extension Version String1129274
-Node: Input Parsers1129969
-Node: Output Wrappers1144613
-Node: Two-way processors1149461
-Node: Printing Messages1151822
-Ref: Printing Messages-Footnote-11153036
-Node: Updating ERRNO1153191
-Node: Requesting Values1153990
-Ref: table-value-types-returned1154743
-Node: Accessing Parameters1155852
-Node: Symbol Table Access1157136
-Node: Symbol table by name1157652
-Ref: Symbol table by name-Footnote-11160863
-Node: Symbol table by cookie1160995
-Ref: Symbol table by cookie-Footnote-11165276
-Node: Cached values1165340
-Ref: Cached values-Footnote-11168984
-Node: Array Manipulation1169141
-Ref: Array Manipulation-Footnote-11170244
-Node: Array Data Types1170281
-Ref: Array Data Types-Footnote-11173103
-Node: Array Functions1173203
-Node: Flattening Arrays1178232
-Node: Creating Arrays1185284
-Node: Redirection API1190134
-Node: Extension API Variables1193155
-Node: Extension Versioning1193880
-Ref: gawk-api-version1194317
-Node: Extension GMP/MPFR Versioning1196105
-Node: Extension API Informational Variables1197811
-Node: Extension API Boilerplate1198972
-Node: Changes from API V11203108
-Node: Finding Extensions1204742
-Node: Extension Example1205317
-Node: Internal File Description1206141
-Node: Internal File Ops1210465
-Ref: Internal File Ops-Footnote-11222023
-Node: Using Internal File Ops1222171
-Ref: Using Internal File Ops-Footnote-11224602
-Node: Extension Samples1224880
-Node: Extension Sample File Functions1226449
-Node: Extension Sample Fnmatch1234587
-Node: Extension Sample Fork1236182
-Node: Extension Sample Inplace1237458
-Node: Extension Sample Ord1241130
-Node: Extension Sample Readdir1242006
-Ref: table-readdir-file-types1242903
-Node: Extension Sample Revout1244041
-Node: Extension Sample Rev2way1244638
-Node: Extension Sample Read write array1245390
-Node: Extension Sample Readfile1248664
-Node: Extension Sample Time1249795
-Node: Extension Sample API Tests1252085
-Node: gawkextlib1252593
-Node: Extension summary1255629
-Node: Extension Exercises1259487
-Node: Language History1260765
-Node: V7/SVR3.11262479
-Node: SVR41264829
-Node: POSIX1266361
-Node: BTL1267786
-Node: POSIX/GNU1268555
-Node: Feature History1275086
-Node: Common Extensions1294653
-Node: Ranges and Locales1296130
-Ref: Ranges and Locales-Footnote-11300931
-Ref: Ranges and Locales-Footnote-21300958
-Ref: Ranges and Locales-Footnote-31301197
-Node: Contributors1301420
-Node: History summary1307625
-Node: Installation1309071
-Node: Gawk Distribution1310035
-Node: Getting1310527
-Node: Extracting1311526
-Node: Distribution contents1313238
-Node: Unix Installation1321318
-Node: Quick Installation1322140
-Node: Compiling with MPFR1324686
-Node: Shell Startup Files1325392
-Node: Additional Configuration Options1326549
-Node: Configuration Philosophy1328936
-Node: Compiling from Git1331438
-Node: Building the Documentation1331997
-Node: Non-Unix Installation1333409
-Node: PC Installation1333885
-Node: PC Binary Installation1334758
-Node: PC Compiling1335663
-Node: PC Using1336841
-Node: Cygwin1340569
-Node: MSYS1341825
-Node: OpenVMS Installation1342457
-Node: OpenVMS Compilation1343138
-Ref: OpenVMS Compilation-Footnote-11344621
-Node: OpenVMS Dynamic Extensions1344683
-Node: OpenVMS Installation Details1346319
-Node: OpenVMS Running1348754
-Node: OpenVMS GNV1352891
-Node: Bugs1353646
-Node: Bug definition1354570
-Node: Bug address1358221
-Node: Usenet1361812
-Node: Performance bugs1363043
-Node: Asking for help1366061
-Node: Maintainers1368052
-Node: Other Versions1369079
-Node: Installation summary1378011
-Node: Notes1379395
-Node: Compatibility Mode1380205
-Node: Additions1381027
-Node: Accessing The Source1381972
-Node: Adding Code1383507
-Node: New Ports1390643
-Node: Derived Files1395153
-Ref: Derived Files-Footnote-11401000
-Ref: Derived Files-Footnote-21401035
-Ref: Derived Files-Footnote-31401652
-Node: Future Extensions1401766
-Node: Implementation Limitations1402438
-Node: Extension Design1403680
-Node: Old Extension Problems1404844
-Ref: Old Extension Problems-Footnote-11406420
-Node: Extension New Mechanism Goals1406481
-Ref: Extension New Mechanism Goals-Footnote-11409977
-Node: Extension Other Design Decisions1410178
-Node: Extension Future Growth1412377
-Node: Notes summary1413001
-Node: Basic Concepts1414214
-Node: Basic High Level1414899
-Ref: figure-general-flow1415181
-Ref: figure-process-flow1415888
-Ref: Basic High Level-Footnote-11419289
-Node: Basic Data Typing1419478
-Node: Glossary1422896
-Node: Copying1456018
-Node: GNU Free Documentation License1493779
-Node: Index1519102
+Node: Command Line Field Separator251689
+Node: Full Line Fields255075
+Ref: Full Line Fields-Footnote-1256655
+Ref: Full Line Fields-Footnote-2256701
+Node: Field Splitting Summary256809
+Node: Constant Size259243
+Node: Fixed width data259987
+Node: Skipping intervening263506
+Node: Allowing trailing data264308
+Node: Fields with fixed data265373
+Node: Splitting By Content266999
+Ref: Splitting By Content-Footnote-1271268
+Node: More CSV271431
+Node: FS versus FPAT273084
+Node: Testing field creation274293
+Node: Multiple Line276071
+Node: Getline282553
+Node: Plain Getline285139
+Node: Getline/Variable287789
+Node: Getline/File288986
+Node: Getline/Variable/File290434
+Ref: Getline/Variable/File-Footnote-1292079
+Node: Getline/Pipe292175
+Node: Getline/Variable/Pipe294988
+Node: Getline/Coprocess296171
+Node: Getline/Variable/Coprocess297494
+Node: Getline Notes298260
+Node: Getline Summary301221
+Ref: table-getline-variants301665
+Node: Read Timeout302570
+Ref: Read Timeout-Footnote-1306534
+Node: Retrying Input306592
+Node: Command-line directories307859
+Node: Input Summary308797
+Node: Input Exercises312177
+Node: Printing312617
+Node: Print314560
+Node: Print Examples316066
+Node: Output Separators318919
+Node: OFMT321030
+Node: Printf322453
+Node: Basic Printf323258
+Node: Control Letters324894
+Node: Format Modifiers330363
+Node: Printf Examples336649
+Node: Redirection339194
+Node: Special FD346268
+Ref: Special FD-Footnote-1349558
+Node: Special Files349644
+Node: Other Inherited Files350273
+Node: Special Network351338
+Node: Special Caveats352226
+Node: Close Files And Pipes353209
+Ref: Close Files And Pipes-Footnote-1359345
+Node: Close Return Value359501
+Ref: table-close-pipe-return-values360776
+Ref: Close Return Value-Footnote-1361610
+Node: Noflush361766
+Node: Nonfatal363278
+Node: Output Summary365695
+Node: Output Exercises366981
+Node: Expressions367672
+Node: Values368874
+Node: Constants369552
+Node: Scalar Constants370249
+Ref: Scalar Constants-Footnote-1372827
+Ref: Scalar Constants-Footnote-2373077
+Node: Nondecimal-numbers373157
+Node: Regexp Constants376278
+Node: Using Constant Regexps376824
+Node: Standard Regexp Constants377470
+Node: Strong Regexp Constants380770
+Node: Variables384621
+Node: Using Variables385286
+Node: Assignment Options387266
+Node: Conversion389828
+Node: Strings And Numbers390360
+Ref: Strings And Numbers-Footnote-1393579
+Node: Locale influences conversions393688
+Ref: table-locale-affects396538
+Node: All Operators397181
+Node: Arithmetic Ops397822
+Node: Concatenation400652
+Ref: Concatenation-Footnote-1403602
+Node: Assignment Ops403725
+Ref: table-assign-ops408864
+Node: Increment Ops410246
+Node: Truth Values and Conditions413845
+Node: Truth Values414971
+Node: Typing and Comparison416062
+Node: Variable Typing416898
+Ref: Variable Typing-Footnote-1423560
+Ref: Variable Typing-Footnote-2423640
+Node: Comparison Operators423723
+Ref: table-relational-ops424150
+Node: POSIX String Comparison427836
+Ref: POSIX String Comparison-Footnote-1429595
+Ref: POSIX String Comparison-Footnote-2429738
+Node: Boolean Ops429822
+Ref: Boolean Ops-Footnote-1434515
+Node: Conditional Exp434611
+Node: Function Calls436397
+Node: Precedence440347
+Node: Locales444224
+Node: Expressions Summary445906
+Node: Patterns and Actions448569
+Node: Pattern Overview449711
+Node: Regexp Patterns451437
+Node: Expression Patterns451983
+Node: Ranges455892
+Node: BEGIN/END459070
+Node: Using BEGIN/END459883
+Ref: Using BEGIN/END-Footnote-1462793
+Node: I/O And BEGIN/END462903
+Node: BEGINFILE/ENDFILE465384
+Node: Empty468825
+Node: Using Shell Variables469142
+Node: Action Overview471480
+Node: Statements473915
+Node: If Statement475813
+Node: While Statement477382
+Node: Do Statement479470
+Node: For Statement480656
+Node: Switch Statement484013
+Node: Break Statement486564
+Node: Continue Statement488756
+Node: Next Statement490688
+Node: Nextfile Statement493185
+Node: Exit Statement496046
+Node: Built-in Variables498579
+Node: User-modified499758
+Node: Auto-set507969
+Ref: Auto-set-Footnote-1526068
+Ref: Auto-set-Footnote-2526286
+Node: ARGC and ARGV526342
+Node: Pattern Action Summary530781
+Node: Arrays533397
+Node: Array Basics534774
+Node: Array Intro535624
+Ref: figure-array-elements537640
+Ref: Array Intro-Footnote-1540509
+Node: Reference to Elements540641
+Node: Assigning Elements543163
+Node: Array Example543658
+Node: Scanning an Array545627
+Node: Controlling Scanning548724
+Ref: Controlling Scanning-Footnote-1555370
+Node: Numeric Array Subscripts555694
+Node: Uninitialized Subscripts557968
+Node: Delete559647
+Ref: Delete-Footnote-1562461
+Node: Multidimensional562518
+Node: Multiscanning565723
+Node: Arrays of Arrays567395
+Node: Arrays Summary572295
+Node: Functions574484
+Node: Built-in575544
+Node: Calling Built-in576733
+Node: Boolean Functions578780
+Node: Numeric Functions579350
+Ref: Numeric Functions-Footnote-1583543
+Ref: Numeric Functions-Footnote-2584227
+Ref: Numeric Functions-Footnote-3584279
+Node: String Functions584555
+Ref: String Functions-Footnote-1610786
+Ref: String Functions-Footnote-2610920
+Ref: String Functions-Footnote-3611180
+Node: Gory Details611267
+Ref: table-sub-escapes613174
+Ref: table-sub-proposed614820
+Ref: table-posix-sub616330
+Ref: table-gensub-escapes618018
+Ref: Gory Details-Footnote-1618952
+Node: I/O Functions619106
+Ref: table-system-return-values625793
+Ref: I/O Functions-Footnote-1627964
+Ref: I/O Functions-Footnote-2628112
+Node: Time Functions628232
+Ref: Time Functions-Footnote-1639388
+Ref: Time Functions-Footnote-2639464
+Ref: Time Functions-Footnote-3639626
+Ref: Time Functions-Footnote-4639737
+Ref: Time Functions-Footnote-5639855
+Ref: Time Functions-Footnote-6640090
+Node: Bitwise Functions640372
+Ref: table-bitwise-ops640974
+Ref: Bitwise Functions-Footnote-1647228
+Ref: Bitwise Functions-Footnote-2647407
+Node: Type Functions647604
+Node: I18N Functions651197
+Node: User-defined652940
+Node: Definition Syntax653760
+Ref: Definition Syntax-Footnote-1659588
+Node: Function Example659665
+Ref: Function Example-Footnote-1662644
+Node: Function Calling662666
+Node: Calling A Function663260
+Node: Variable Scope664230
+Node: Pass By Value/Reference667284
+Node: Function Caveats670016
+Ref: Function Caveats-Footnote-1672111
+Node: Return Statement672235
+Node: Dynamic Typing675290
+Node: Indirect Calls677682
+Node: Functions Summary688841
+Node: Library Functions691618
+Ref: Library Functions-Footnote-1695166
+Ref: Library Functions-Footnote-2695309
+Node: Library Names695484
+Ref: Library Names-Footnote-1699278
+Ref: Library Names-Footnote-2699505
+Node: General Functions699601
+Node: Strtonum Function700795
+Node: Assert Function703877
+Node: Round Function707329
+Node: Cliff Random Function708907
+Node: Ordinal Functions709940
+Ref: Ordinal Functions-Footnote-1713049
+Ref: Ordinal Functions-Footnote-2713301
+Node: Join Function713515
+Ref: Join Function-Footnote-1715318
+Node: Getlocaltime Function715522
+Node: Readfile Function719296
+Node: Shell Quoting721325
+Node: Isnumeric Function722781
+Node: Data File Management724193
+Node: Filetrans Function724825
+Node: Rewind Function729119
+Node: File Checking731098
+Ref: File Checking-Footnote-1732470
+Node: Empty Files732677
+Node: Ignoring Assigns734744
+Node: Getopt Function736318
+Ref: Getopt Function-Footnote-1752152
+Node: Passwd Functions752364
+Ref: Passwd Functions-Footnote-1761546
+Node: Group Functions761634
+Ref: Group Functions-Footnote-1769772
+Node: Walking Arrays769985
+Node: Library Functions Summary773033
+Node: Library Exercises774457
+Node: Sample Programs774944
+Node: Running Examples775726
+Node: Clones776478
+Node: Cut Program777750
+Node: Egrep Program788191
+Node: Id Program797508
+Node: Split Program807622
+Ref: Split Program-Footnote-1817857
+Node: Tee Program818044
+Node: Uniq Program820953
+Node: Wc Program828818
+Node: Bytes vs. Characters829213
+Node: Using extensions830815
+Node: wc program831595
+Node: Miscellaneous Programs836601
+Node: Dupword Program837830
+Node: Alarm Program839893
+Node: Translate Program844806
+Ref: Translate Program-Footnote-1849547
+Node: Labels Program849825
+Ref: Labels Program-Footnote-1853266
+Node: Word Sorting853358
+Node: History Sorting857552
+Node: Extract Program859827
+Node: Simple Sed868096
+Node: Igawk Program871312
+Ref: Igawk Program-Footnote-1886559
+Ref: Igawk Program-Footnote-2886765
+Ref: Igawk Program-Footnote-3886895
+Node: Anagram Program887022
+Node: Signature Program890118
+Node: Programs Summary891370
+Node: Programs Exercises892628
+Ref: Programs Exercises-Footnote-1896944
+Node: Advanced Features897030
+Node: Nondecimal Data899524
+Node: Boolean Typed Values901154
+Node: Array Sorting903129
+Node: Controlling Array Traversal903858
+Ref: Controlling Array Traversal-Footnote-1912365
+Node: Array Sorting Functions912487
+Ref: Array Sorting Functions-Footnote-1918606
+Node: Two-way I/O918814
+Ref: Two-way I/O-Footnote-1926809
+Ref: Two-way I/O-Footnote-2927000
+Node: TCP/IP Networking927082
+Node: Profiling930262
+Node: Persistent Memory939972
+Ref: Persistent Memory-Footnote-1948930
+Node: Extension Philosophy949061
+Node: Advanced Features Summary950596
+Node: Internationalization952866
+Node: I18N and L10N954572
+Node: Explaining gettext955267
+Ref: Explaining gettext-Footnote-1961420
+Ref: Explaining gettext-Footnote-2961615
+Node: Programmer i18n961780
+Ref: Programmer i18n-Footnote-1966893
+Node: Translator i18n966942
+Node: String Extraction967778
+Ref: String Extraction-Footnote-1968956
+Node: Printf Ordering969054
+Ref: Printf Ordering-Footnote-1971916
+Node: I18N Portability971984
+Ref: I18N Portability-Footnote-1974558
+Node: I18N Example974629
+Ref: I18N Example-Footnote-1978029
+Ref: I18N Example-Footnote-2978105
+Node: Gawk I18N978222
+Node: I18N Summary978878
+Node: Debugger980279
+Node: Debugging981303
+Node: Debugging Concepts981752
+Node: Debugging Terms983578
+Node: Awk Debugging986191
+Ref: Awk Debugging-Footnote-1987168
+Node: Sample Debugging Session987308
+Node: Debugger Invocation987860
+Node: Finding The Bug989489
+Node: List of Debugger Commands996175
+Node: Breakpoint Control997552
+Node: Debugger Execution Control1001384
+Node: Viewing And Changing Data1004864
+Node: Execution Stack1008602
+Node: Debugger Info1010283
+Node: Miscellaneous Debugger Commands1014582
+Node: Readline Support1019835
+Node: Limitations1020781
+Node: Debugging Summary1023425
+Node: Namespaces1024728
+Node: Global Namespace1025855
+Node: Qualified Names1027300
+Node: Default Namespace1028335
+Node: Changing The Namespace1029110
+Node: Naming Rules1030804
+Node: Internal Name Management1032719
+Node: Namespace Example1033789
+Node: Namespace And Features1036372
+Node: Namespace Summary1037829
+Node: Arbitrary Precision Arithmetic1039342
+Node: Computer Arithmetic1040861
+Ref: table-numeric-ranges1044678
+Ref: table-floating-point-ranges1045176
+Ref: Computer Arithmetic-Footnote-11045835
+Node: Math Definitions1045894
+Ref: table-ieee-formats1048939
+Node: MPFR features1049513
+Node: MPFR On Parole1049966
+Ref: MPFR On Parole-Footnote-11050810
+Node: MPFR Intro1050969
+Node: FP Math Caution1052659
+Ref: FP Math Caution-Footnote-11053733
+Node: Inexactness of computations1054110
+Node: Inexact representation1055141
+Node: Comparing FP Values1056524
+Node: Errors accumulate1057782
+Node: Strange values1059249
+Ref: Strange values-Footnote-11061915
+Node: Getting Accuracy1062020
+Node: Try To Round1064757
+Node: Setting precision1065664
+Ref: table-predefined-precision-strings1066369
+Node: Setting the rounding mode1068254
+Ref: table-gawk-rounding-modes1068636
+Ref: Setting the rounding mode-Footnote-11072694
+Node: Arbitrary Precision Integers1072877
+Ref: Arbitrary Precision Integers-Footnote-11076089
+Node: Checking for MPFR1076245
+Node: POSIX Floating Point Problems1077735
+Ref: POSIX Floating Point Problems-Footnote-11082599
+Node: Floating point summary1082637
+Node: Dynamic Extensions1084901
+Node: Extension Intro1086500
+Node: Plugin License1087808
+Node: Extension Mechanism Outline1088621
+Ref: figure-load-extension1089072
+Ref: figure-register-new-function1090657
+Ref: figure-call-new-function1091767
+Node: Extension API Description1093891
+Node: Extension API Functions Introduction1095620
+Ref: table-api-std-headers1097518
+Node: General Data Types1101982
+Ref: General Data Types-Footnote-11111150
+Node: Memory Allocation Functions1111465
+Ref: Memory Allocation Functions-Footnote-11116190
+Node: Constructor Functions1116289
+Node: API Ownership of MPFR and GMP Values1120194
+Node: Registration Functions1121755
+Node: Extension Functions1122459
+Node: Exit Callback Functions1128035
+Node: Extension Version String1129354
+Node: Input Parsers1130049
+Node: Output Wrappers1144693
+Node: Two-way processors1149541
+Node: Printing Messages1151902
+Ref: Printing Messages-Footnote-11153116
+Node: Updating ERRNO1153271
+Node: Requesting Values1154070
+Ref: table-value-types-returned1154823
+Node: Accessing Parameters1155932
+Node: Symbol Table Access1157216
+Node: Symbol table by name1157732
+Ref: Symbol table by name-Footnote-11160943
+Node: Symbol table by cookie1161075
+Ref: Symbol table by cookie-Footnote-11165356
+Node: Cached values1165420
+Ref: Cached values-Footnote-11169064
+Node: Array Manipulation1169221
+Ref: Array Manipulation-Footnote-11170324
+Node: Array Data Types1170361
+Ref: Array Data Types-Footnote-11173183
+Node: Array Functions1173283
+Node: Flattening Arrays1178312
+Node: Creating Arrays1185364
+Node: Redirection API1190214
+Node: Extension API Variables1193235
+Node: Extension Versioning1193960
+Ref: gawk-api-version1194397
+Node: Extension GMP/MPFR Versioning1196185
+Node: Extension API Informational Variables1197891
+Node: Extension API Boilerplate1199052
+Node: Changes from API V11203188
+Node: Finding Extensions1204822
+Node: Extension Example1205397
+Node: Internal File Description1206221
+Node: Internal File Ops1210545
+Ref: Internal File Ops-Footnote-11222103
+Node: Using Internal File Ops1222251
+Ref: Using Internal File Ops-Footnote-11224682
+Node: Extension Samples1224960
+Node: Extension Sample File Functions1226529
+Node: Extension Sample Fnmatch1234667
+Node: Extension Sample Fork1236262
+Node: Extension Sample Inplace1237538
+Node: Extension Sample Ord1241210
+Node: Extension Sample Readdir1242086
+Ref: table-readdir-file-types1242983
+Node: Extension Sample Revout1244121
+Node: Extension Sample Rev2way1244718
+Node: Extension Sample Read write array1245470
+Node: Extension Sample Readfile1248744
+Node: Extension Sample Time1249875
+Node: Extension Sample API Tests1252165
+Node: gawkextlib1252673
+Node: Extension summary1255709
+Node: Extension Exercises1259567
+Node: Language History1260845
+Node: V7/SVR3.11262559
+Node: SVR41264909
+Node: POSIX1266441
+Node: BTL1267866
+Node: POSIX/GNU1268635
+Node: Feature History1275166
+Node: Common Extensions1294733
+Node: Ranges and Locales1296210
+Ref: Ranges and Locales-Footnote-11301011
+Ref: Ranges and Locales-Footnote-21301038
+Ref: Ranges and Locales-Footnote-31301277
+Node: Contributors1301500
+Node: History summary1307705
+Node: Installation1309151
+Node: Gawk Distribution1310115
+Node: Getting1310607
+Node: Extracting1311606
+Node: Distribution contents1313318
+Node: Unix Installation1321398
+Node: Quick Installation1322220
+Node: Compiling with MPFR1324766
+Node: Shell Startup Files1325472
+Node: Additional Configuration Options1326629
+Node: Configuration Philosophy1329016
+Node: Compiling from Git1331518
+Node: Building the Documentation1332077
+Node: Non-Unix Installation1333489
+Node: PC Installation1333965
+Node: PC Binary Installation1334838
+Node: PC Compiling1335743
+Node: PC Using1336921
+Node: Cygwin1340649
+Node: MSYS1341905
+Node: OpenVMS Installation1342537
+Node: OpenVMS Compilation1343218
+Ref: OpenVMS Compilation-Footnote-11344701
+Node: OpenVMS Dynamic Extensions1344763
+Node: OpenVMS Installation Details1346399
+Node: OpenVMS Running1348834
+Node: OpenVMS GNV1352971
+Node: Bugs1353726
+Node: Bug definition1354650
+Node: Bug address1358301
+Node: Usenet1361892
+Node: Performance bugs1363123
+Node: Asking for help1366141
+Node: Maintainers1368132
+Node: Other Versions1369159
+Node: Installation summary1378091
+Node: Notes1379475
+Node: Compatibility Mode1380285
+Node: Additions1381107
+Node: Accessing The Source1382052
+Node: Adding Code1383587
+Node: New Ports1390723
+Node: Derived Files1395233
+Ref: Derived Files-Footnote-11401080
+Ref: Derived Files-Footnote-21401115
+Ref: Derived Files-Footnote-31401732
+Node: Future Extensions1401846
+Node: Implementation Limitations1402518
+Node: Extension Design1403760
+Node: Old Extension Problems1404924
+Ref: Old Extension Problems-Footnote-11406500
+Node: Extension New Mechanism Goals1406561
+Ref: Extension New Mechanism Goals-Footnote-11410057
+Node: Extension Other Design Decisions1410258
+Node: Extension Future Growth1412457
+Node: Notes summary1413081
+Node: Basic Concepts1414294
+Node: Basic High Level1414979
+Ref: figure-general-flow1415261
+Ref: figure-process-flow1415968
+Ref: Basic High Level-Footnote-11419369
+Node: Basic Data Typing1419558
+Node: Glossary1422976
+Node: Copying1456098
+Node: GNU Free Documentation License1493859
+Node: Index1519182

End Tag Table
diff --git a/doc/gawk.texi b/doc/gawk.texi
index 54ac53b3..55c99d2e 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -8162,7 +8162,8 @@ Many CSV files are imported from systems where the line terminator
for text files is a carriage-return--line-feed pair
(CR-LF, @samp{\r} followed by @samp{\n}).
For ease of use, when processing CSV files, @command{gawk} simply
-strips out any carriage-return characters in the input.
+includes the carriage-return character in the record terminator
+when it occurs immediately prior to a line-feed character in the input.
@docbook
</sidebar>
@@ -8183,7 +8184,8 @@ Many CSV files are imported from systems where the line terminator
for text files is a carriage-return--line-feed pair
(CR-LF, @samp{\r} followed by @samp{\n}).
For ease of use, when processing CSV files, @command{gawk} simply
-strips out any carriage-return characters in the input.
+includes the carriage-return character in the record terminator
+when it occurs immediately prior to a line-feed character in the input.
@end cartouche
@end ifnotdocbook
diff --git a/doc/gawktexi.in b/doc/gawktexi.in
index 01c37578..44a7d551 100644
--- a/doc/gawktexi.in
+++ b/doc/gawktexi.in
@@ -7718,7 +7718,8 @@ Many CSV files are imported from systems where the line terminator
for text files is a carriage-return--line-feed pair
(CR-LF, @samp{\r} followed by @samp{\n}).
For ease of use, when processing CSV files, @command{gawk} simply
-strips out any carriage-return characters in the input.
+includes the carriage-return character in the record terminator
+when it occurs immediately prior to a line-feed character in the input.
@end sidebar
The behavior of the @code{split()} function (not formally discussed
diff --git a/io.c b/io.c
index d2e5d2b4..4f230ea2 100644
--- a/io.c
+++ b/io.c
@@ -3855,14 +3855,6 @@ csvscan(IOBUF *iop, struct recmatch *recm, SCANSTATE *state)
 		while (*bp != rs) {
 			if (*bp == '\"')
 				in_quote = ! in_quote;
-			else if (*bp == '\r') { // strip CRs
-				size_t count = (iop->dataend - bp);
-
-				// shift it all down by one
-				memmove(bp, bp + 1, count);
-				iop->dataend--;
-				bp--; // compensate for the upcoming bp++
-			}
 			bp++;
 		}
 	} while (in_quote && bp < iop->dataend && bp++);
@@ -3871,8 +3863,15 @@ csvscan(IOBUF *iop, struct recmatch *recm, SCANSTATE *state)
 	recm->len = bp - recm->start;
 	if (bp < iop->dataend) { /* found it in the buffer */
-		recm->rt_start = bp;
-		recm->rt_len = 1;
+		if (bp > iop->off && bp[-1] == '\r') {
+			/* handle CR LF conventional CSV record terminator */
+			recm->rt_start = bp - 1;
+			recm->rt_len = 2;
+		}
+		else {
+			recm->rt_start = bp;
+			recm->rt_len = 1;
+		}
 		*state = NOSTATE;
 		return REC_OK;
 	} else {
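
The terminator selection in the second io.c hunk can be tried in isolation with a minimal sketch like the one below. It is not gawk code: `struct term` and `find_terminator` are hypothetical names, the buffer is a plain character array rather than an IOBUF, and there is no sentinel or buffer refill. It only shows the rule the patch describes: a CR becomes part of the terminator when it sits directly before the LF, and any other CR stays in the record data.

```c
#include <stddef.h>

/* Hypothetical stand-in for the rt_start/rt_len fields of gawk's struct recmatch. */
struct term {
	size_t rt_start;	/* offset where the terminator begins (len if none found) */
	size_t rt_len;		/* 0 if none found, 1 for LF, 2 for CR LF */
};

/*
 * Scan buf[0..len) for the first newline outside double quotes
 * (a simplified version of the quote tracking in csvscan), then
 * decide whether a preceding CR belongs to the terminator.
 */
static struct term
find_terminator(const char *buf, size_t len)
{
	struct term t = { len, 0 };
	int in_quote = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		if (buf[i] == '"')
			in_quote = ! in_quote;
		else if (buf[i] == '\n' && ! in_quote)
			break;
	}
	if (i == len)
		return t;	/* no terminator in this buffer */

	if (i > 0 && buf[i - 1] == '\r') {
		/* CR directly before LF: both form the terminator */
		t.rt_start = i - 1;
		t.rt_len = 2;
	} else {
		t.rt_start = i;
		t.rt_len = 1;
	}
	return t;
}
```

Under these assumptions, `find_terminator("a,b\r\nc", 6)` reports a two-character terminator starting at offset 3, while `find_terminator("a\rb\n", 4)` reports a one-character terminator at offset 3 and leaves the lone CR in the data.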