GitHub - jftuga/chars: Determine the end-of-line format, tabs, bom, and nul characters
chars
Determine the end-of-line format, tabs, bom, and nul characters
Usage
- For help, run `chars -h`
chars v2.4.0
Determine the end-of-line format, tabs, bom, and nul
https://github.com/jftuga/chars
Usage:
chars [filename or file-glob 1] [filename or file-glob 2] ...
  -F	when used with -f, only display a list of failed files, one per line
  -b	examine binary files
  -c	add comma thousands separator to numeric values
  -e string
        exclude based on regular expression; use .* instead of *
  -f string
        fail with OS exit code=100 if any of the included characters exist; ex: -f crlf,nul,bom8
  -j	output results in JSON format; can't be used with -l; does not honor -t or -c
  -l int
        shorten files names to a maximum of this length
  -t	append a row which includes a total for each column
  -v	display version and then exit
Notes:
Use - to read a file from STDIN
On Windows, try: chars *  -or-  chars */*  -or-  chars */*/*
Installation
- macOS: `brew update; brew install jftuga/tap/chars`
- Binaries for Linux, macOS and Windows are provided in the releases section.
 
Example 1
- Run `chars` with no additional cmd-line switches
  - Only report files in the current directory
  - Report text files only since `-b` is not used
 
PS C:\chars> .\chars.exe *
+-----------------+------+-----+-----+------+------+-------+-----------+
|    FILENAME     | CRLF | LF  | TAB | NUL  | BOM8 | BOM16 | BYTESREAD |
+-----------------+------+-----+-----+------+------+-------+-----------+
| .goreleaser.yml |    0 |  59 |   0 |    0 |    0 |     0 |      1066 |
| LICENSE         |    0 |  21 |   0 |    0 |    0 |     0 |      1068 |
| README.md       |    0 |  92 |   0 |    0 |    0 |     0 |      3510 |
| chars.go        |    0 | 246 | 328 |    0 |    0 |     0 |      6477 |
| go.mod          |    0 |  10 |   2 |    0 |    0 |     0 |       188 |
| go.sum          |    0 |   6 |   0 |    0 |    0 |     0 |       533 |
| testfile1       |    0 |  22 |   0 | 3223 |    0 |     1 |      6448 |
+-----------------+------+-----+-----+------+------+-------+-----------+
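The columns in this table can be approximated with a simple byte scan. The following is a minimal sketch, not the code from `chars.go`: the `Counts` struct and `countChars` function are hypothetical names, and it assumes that an `LF` preceded by `CR` is counted only as `CRLF` (which is consistent with Example 2, where CRLF files report `LF` as 0) and that a BOM is only looked for at the start of the data.

```go
package main

import (
	"bytes"
	"fmt"
)

// Counts mirrors the table columns; the struct itself is hypothetical.
type Counts struct {
	CRLF, LF, Tab, Nul, BOM8, BOM16, BytesRead int
}

// countChars scans data byte by byte. A bare LF is counted as LF;
// an LF preceded by CR is counted as CRLF instead (an assumption).
func countChars(data []byte) Counts {
	c := Counts{BytesRead: len(data)}

	// UTF-8 BOM is EF BB BF; UTF-16 BOM is FE FF (big endian)
	// or FF FE (little endian).
	switch {
	case bytes.HasPrefix(data, []byte{0xEF, 0xBB, 0xBF}):
		c.BOM8 = 1
	case bytes.HasPrefix(data, []byte{0xFE, 0xFF}),
		bytes.HasPrefix(data, []byte{0xFF, 0xFE}):
		c.BOM16 = 1
	}

	for i, b := range data {
		switch b {
		case '\n':
			if i > 0 && data[i-1] == '\r' {
				c.CRLF++
			} else {
				c.LF++
			}
		case '\t':
			c.Tab++
		case 0:
			c.Nul++
		}
	}
	return c
}

func main() {
	sample := []byte("a\tb\r\nc\nd\x00")
	fmt.Printf("%+v\n", countChars(sample))
}
```

Because the scan is byte-wise, a UTF-16 file shows up with many `NUL` bytes, much like `testfile1` above.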
Example 2
- Run `chars` with `-e` and `-l` cmd-line switches
  - Only report files starting with `p` in the `C:\Windows\System32` directory
  - Exclude all files matching `perf.*dat`
  - Shorten filenames to a maximum length of `32`
 
PS C:\chars> .\chars.exe -e perf.*dat -l 32 C:\Windows\System32\p*
+----------------------------------+------+----+-----+------+------+-------+-----------+
|             FILENAME             | CRLF | LF | TAB | NUL  | BOM8 | BOM16 | BYTESREAD |
+----------------------------------+------+----+-----+------+------+-------+-----------+
| C:\Windows\System32\pcl.sep      |   11 |  0 |   0 |    0 |    0 |     0 |       150 |
| C:\Windows\System32\perfmon.msc  | 1933 |  0 |   0 |    0 |    0 |     0 |    145519 |
| C:\Windows\Sys...tmanagement.msc | 1945 |  0 |   0 |    0 |    0 |     0 |    146389 |
| C:\Windows\System32\pscript.sep  |    2 |  0 |   0 |    0 |    0 |     0 |        51 |
| C:\Windows\Sys...eryprovider.mof |    0 | 61 |   0 | 2073 |    0 |     1 |      4148 |
+----------------------------------+------+----+-----+------+------+-------+-----------+
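The shortened names in this table keep the start and end of the path and replace the middle with `...`. Below is a minimal sketch of that idea. It is not the code of the ellipsis module listed under Acknowledgments; the `shorten` function and the sample path are hypothetical, and the head/tail split is an assumption chosen so that a limit of 32 keeps 14 leading and 15 trailing characters, as seen in the rows above.

```go
package main

import "fmt"

// shorten keeps the start and end of s and replaces the middle with "..."
// so that the result is at most limit characters long.
func shorten(s string, limit int) string {
	r := []rune(s)
	if limit < 0 {
		limit = 0
	}
	if len(r) <= limit {
		return s
	}
	const dots = "..."
	if limit <= len(dots) {
		// Too short for an ellipsis; fall back to plain truncation.
		return string(r[:limit])
	}
	keep := limit - len(dots)
	head := keep / 2    // characters kept from the start
	tail := keep - head // characters kept from the end
	return string(r[:head]) + dots + string(r[len(r)-tail:])
}

func main() {
	// Hypothetical path, used only for illustration.
	fmt.Println(shorten(`C:\Windows\System32\an-example-long-name.msc`, 32))
}
```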
Example 3
- Pipe STDIN to `chars`
- Use JSON output, with `-j`

$ curl -s https://example.com/ | chars -j
[
    {
        "filename": "STDIN",
        "crlf": 0,
        "lf": 46,
        "tab": 0,
        "bom8": 0,
        "bom16": 0,
        "nul": 0,
        "bytesRead": 1256
    }
]
Example 4
- Fail when certain characters are detected, with `-f`
  - OS exit code on a `-f` failure is always `100`
  - `-f` is a comma-delimited list containing: `crlf`, `lf`, `tab`, `nul`, `bom8`, `bom16`
 
$ chars -f lf,tab /etc/group ; echo $?
+------------+------+----+-----+-----+------+-------+-----------+
|  FILENAME  | CRLF | LF | TAB | NUL | BOM8 | BOM16 | BYTESREAD |
+------------+------+----+-----+-----+------+-------+-----------+
| /etc/group |    0 | 58 |   0 |   0 |    0 |     0 |       795 |
+------------+------+----+-----+-----+------+-------+-----------+
100
Example 5
- Fail when certain characters are detected, with `-f`
- Only output failed file names, with `-F`

$ chars -f lf,tab -F /etc/gr* ; echo $?
/etc/group
/etc/group.bak
100
Example 6
- Output to JSON, with `-j`
- Use `-e` to exclude any filenames starting with `go`, such as `go.mod` and `go.sum`
- Use `jq` to output to `CSV` containing two columns: `filename`, `tab`
  - Only include files that contain `tab` characters

$ chars -e '^go' -j * | jq -r '.[] | select(.tab > 0) | [.filename,.tab] | @csv'
"case.go",80
"chars.go",475
Example 7
- Output totals, with `-t`
- Output commas in numeric values, with `-c`
- Exclude files containing `.g*`, with `-e`
PS C:\chars> .\chars.exe -t -c -e "\.g.*" *
+-----------------+------+-----+-----+-----+------+-------+-----------+
|    FILENAME     | CRLF | LF  | TAB | NUL | BOM8 | BOM16 | BYTESREAD |
+-----------------+------+-----+-----+-----+------+-------+-----------+
| LICENSE         |    0 |  21 |   0 |   0 |    0 |     0 |     1,068 |
| README.md       |    0 | 178 |   4 |   0 |    0 |     0 |     6,656 |
| STATUS.md       |    0 |  50 |   0 |   0 |    0 |     0 |     3,055 |
| go.mod          |    0 |  11 |   3 |   0 |    0 |     0 |       214 |
| go.sum          |    0 |   9 |   0 |   0 |    0 |     0 |       795 |
| TOTALS: 5 files |    0 | 269 |   7 |   0 |    0 |     0 |    11,788 |
+-----------------+------+-----+-----+-----+------+-------+-----------+
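The `-e` switch takes a regular expression rather than a glob, which is why `go.mod` and `go.sum` still appear above: `\.g.*` requires a literal dot followed by `g`. The sketch below illustrates that behavior with Go's `regexp` package. It assumes an unanchored match against the filename; `excludeMatching` is a hypothetical helper, not a function from `chars.go`.

```go
package main

import (
	"fmt"
	"regexp"
)

// excludeMatching returns the names that do NOT match the pattern.
// The match is unanchored, so the pattern may hit anywhere in the name.
func excludeMatching(names []string, pattern string) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	var kept []string
	for _, name := range names {
		if !re.MatchString(name) {
			kept = append(kept, name)
		}
	}
	return kept, nil
}

func main() {
	names := []string{".goreleaser.yml", "LICENSE", "go.mod", "go.sum"}
	kept, err := excludeMatching(names, `\.g.*`)
	if err != nil {
		fmt.Println("bad pattern:", err)
		return
	}
	fmt.Println(kept) // [LICENSE go.mod go.sum]
}
```

A glob-style pattern such as `*.go` is rejected by `regexp.Compile` because `*` has nothing to repeat, which matches the help text's advice to use `.*` instead of `*`.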
Reading from STDIN on Windows
- Results may vary when piping to `STDIN` under Windows
  - Under `cmd`, instead of `type input.txt | chars`, use `<` redirection when possible: `chars < input.txt`
  - Under a recent version of `powershell`, use `Get-Content -AsByteStream input.txt | chars` instead of just `Get-Content input.txt | chars`
- `cmd` and `powershell` will skip `BOM` characters; these 2 fields will both report a value of `0`
- `cmd` and `powershell` will skip `NUL` characters; this field will report a value of `0`
- `cmd` will convert `LF` to `CRLF` for `UTF-16` encoded files
- `powershell` will convert `LF` to `CRLF`
- Piping from programs such as `curl` will return `LF` characters under `cmd`, but `CRLF` under `powershell`
  - Under powershell, consider using `curl --output`
 
Case Folding on Windows
- Case folding on Windows is partially implemented in case.go.
  - This program attempts case-insensitive filename matching since this is the expected behavior on Windows.
  - It is hard-coded to `English`.
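One simple way to get case-insensitive glob matching is to lowercase both the pattern and the filename before comparing them. The sketch below shows that general idea only; it is not the approach taken in `case.go`, and `matchFold` is a hypothetical helper. `strings.ToLower` applies no locale-specific rules, which is one reason such matching tends to work best for English filenames.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// matchFold reports whether name matches the glob pattern when the case
// of letters is ignored. Both sides are lowercased before matching.
func matchFold(pattern, name string) (bool, error) {
	return filepath.Match(strings.ToLower(pattern), strings.ToLower(name))
}

func main() {
	ok, err := matchFold("*.TXT", "ReadMe.txt")
	if err != nil {
		fmt.Println("bad pattern:", err)
		return
	}
	fmt.Println(ok) // true
}
```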
 
Wikipedia
- Newline - `CRLF` vs `LF`
- Tab key
- Null character
- Byte order mark - `BOM-8` vs `BOM-16`
Acknowledgments
- ellipsis - Go module to insert an ellipsis into the middle of a long string to shorten it
- tablewriter - ASCII table in golang
- /u/skeeto and /u/petreus provided code review and suggestions
 