Oleksandr Gavenko's blog
2017-02-23 23:10 Find file line ending inconsistency in your project

If a project is shared by team members who use different operating systems, its files may end up with a mix of EOL (end-of-line) styles.

The usual solution is to maintain a per-project .hgeol or .gitattributes file.
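For Git, such a file could look like the sketch below. The patterns and the scratch directory are only assumptions for illustration, so adapt them to the file types your project really has:

```shell
# Sketch: write a minimal .gitattributes into a scratch directory.
repo_dir=$(mktemp -d)    # stand-in for your project root (assumption)
cat > "$repo_dir/.gitattributes" <<'EOF'
# Let Git detect text files and store them with LF in the repository.
* text=auto
# Force a specific ending in the working tree where tools require it.
*.sh  text eol=lf
*.bat text eol=crlf
EOF
cat "$repo_dir/.gitattributes"
```

In a real project the file lives in the repository root and is committed together with the sources.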

But before introducing these files you may want to figure out which line endings are used across the project files.

Not so long ago the dos2unix project introduced an option to report line endings.

From /usr/share/doc/dos2unix/ChangeLog.txt:

2014-09-11 Erwin Waterlander <waterlan@xs4all.nl>
    * common.c: New option -i, --info, display file information.
      This new option prints number of line breaks, the byte order
      mark, and if the file is text or binary.
    * man/man1/dos2unix.pod: Added option -i.

To list files that contain LF (Unix) line endings:

$ find . -type f | xargs unix2dos -ic

To list files that contain CR/LF (DOS) line endings:

$ find . -type f | xargs dos2unix -ic

The -i option prevents the utilities from modifying files. Its long form is --info:

$ dos2unix --info *.txt
   6       0       0  no_bom    text    dos.txt
   0       6       0  no_bom    text    unix.txt
   0       0       6  no_bom    text    mac.txt
   6       6       6  no_bom    text    mixed.txt
  50       0       0  UTF-16LE  text    utf16le.txt
   0      50       0  no_bom    text    utf8unix.txt
  50       0       0  UTF-8     text    utf8dos.txt
   2     418     219  no_bom    binary  dos2unix.exe
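The columns of the report are the number of DOS, Unix and Mac line breaks, the byte order mark, text or binary, and the file name. To try it out you can create a few files with known endings. This is a minimal sketch: the scratch directory and file names are made up for the demo, and the report is only printed if dos2unix is installed:

```shell
# Create small files with known line endings in a scratch directory.
demo_dir=$(mktemp -d)
printf 'a\r\nb\r\n' > "$demo_dir/dos.txt"     # two CR/LF endings
printf 'a\nb\n'     > "$demo_dir/unix.txt"    # two LF endings
printf 'a\r\nb\n'   > "$demo_dir/mixed.txt"   # one CR/LF, one LF

# Print the report; --info only reads the files, it does not convert them.
if command -v dos2unix >/dev/null 2>&1; then
    dos2unix --info "$demo_dir"/*.txt
fi
```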

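To find files with mixed line endings we can use the full report. With -idu the first column is the number of DOS (CR/LF) line breaks and the second is the number of Unix (LF) line breaks, so a file is mixed when both are non-zero:

$ find . -type f | xargs dos2unix -idu \
  | awk '$1 != 0 && $2 != 0 {
      printf "dos: %s, unix: %s, file: %s\n", $1, $2, $3 }'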
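If the installed dos2unix is too old to know -i, a rough stand-in can be written in plain awk. This is not what dos2unix does internally, only an approximation: list_mixed_eol is a hypothetical helper name, it counts lines that end with CR before the LF against lines that end with a bare LF, and it ignores classic Mac (CR only) endings and binary detection:

```shell
# list_mixed_eol FILE...: report files that contain both CR/LF and bare LF
# line endings. A last line without a trailing newline counts as an LF line.
list_mixed_eol() {
    awk '
        FNR == 1 { files[++n] = FILENAME }    # remember files in order
        /\r$/    { crlf[FILENAME]++; next }   # line ended with CR before LF
                 { lf[FILENAME]++ }           # line ended with a bare LF
        END {
            for (i = 1; i <= n; i++) {
                f = files[i]
                if (crlf[f] > 0 && lf[f] > 0)
                    printf "dos: %d, unix: %d, file: %s\n", crlf[f], lf[f], f
            }
        }
    ' "$@"
}
```

It can be fed from find in the same way, for example: find . -type f -exec sh -c '...' is not needed, a plain list_mixed_eol ./*.txt is enough for a single directory.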