Saturday, November 06, 2021

UTF-8 content in a Git bare clone: changing the encoding setting to view UTF-8 files

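I recently found a Git repo holding 13,3xx ancient Chinese books, more than ten thousand titles, as UTF-8 text in Simplified Chinese.

(You can read them in Chrome and convert the text to Traditional Chinese, or download them, build an EPUB with calibre, and read it on an ebook reader :D )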


The repo is just too big: multiple gigabytes.

Either download the zip file of the repo (if you don't care about future updates), or do a *minimum* clone:


$ git  clone  --bare  --depth 1  git://....
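
To see what a bare, depth-1 clone actually contains, here is a small sketch that runs entirely on a throwaway local repo. The directory names, the file `book.txt`, and the `demo` identity are made up for illustration; only the `git clone --bare --depth 1` step mirrors the command above. A `file://` URL is used because `--depth` is ignored for plain local paths.

```shell
# Build a two-commit source repo in a temp dir (all names here are hypothetical)
tmp=$(mktemp -d)
git init -q "$tmp/src"
printf 'v1\n' > "$tmp/src/book.txt"
git -C "$tmp/src" add book.txt
git -C "$tmp/src" -c user.name=demo -c user.email=demo@example.com commit -q -m 'first'
printf 'v2\n' > "$tmp/src/book.txt"
git -C "$tmp/src" -c user.name=demo -c user.email=demo@example.com commit -q -am 'second'

# Shallow bare clone: no working tree, only the latest commit
git clone -q --bare --depth 1 "file://$tmp/src" "$tmp/mirror.git"

is_bare=$(git -C "$tmp/mirror.git" rev-parse --is-bare-repository)
commits=$(git -C "$tmp/mirror.git" rev-list --count HEAD)
echo "bare=$is_bare commits=$commits"
```

The clone reports itself as bare and holds a single commit even though the source has two, which is where the size saving comes from.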


# list all files in a bare repo

$ git ls-tree --full-tree -r --name-only HEAD

# -r    recursive


By default, git will print non-ASCII file names in quoted octal notation, i.e. "\nnn\nnn...".
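
Each `\nnn` is one byte of the UTF-8 file name written in octal. As a rough illustration, `printf` can turn such a string back into readable text, since it interprets `\nnn` escapes in its format string. The quoted name below is a made-up example (the bytes of 中文.txt), and the sketch only handles the octal escapes, not every escape git can emit.

```shell
# Hypothetical name as git would print it with core.quotepath on
quoted='"\344\270\255\346\226\207.txt"'

# Strip the surrounding double quotes
inner=${quoted#\"}
inner=${inner%\"}

# Double any literal % so printf does not treat it as a conversion,
# then let printf expand the \nnn octal escapes into raw bytes
fmt=$(printf '%s' "$inner" | sed 's/%/%%/g')
decoded=$(printf "$fmt")
echo "$decoded"
```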

 # show utf8 characters    (ref:  https://stackoverflow.com/questions/22827239/how-to-make-git-properly-display-utf-8-encoded-pathnames-in-the-console-window)

$ git config core.quotepath off


List the files again; the UTF-8 names now display correctly.
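
A quick way to compare the two behaviours side by side, sketched on a throwaway repo (the temp directory, the file name 中文.txt, and the `demo` identity are assumptions for the example). It uses `git -c core.quotepath=...` to override the setting for a single command instead of writing it to the config.

```shell
# Throwaway repo with one UTF-8 file name (hypothetical example data)
tmp=$(mktemp -d)
git init -q "$tmp/demo"
printf 'hello\n' > "$tmp/demo/中文.txt"
git -C "$tmp/demo" add .
git -C "$tmp/demo" -c user.name=demo -c user.email=demo@example.com commit -q -m 'add file'

# Same listing, once with quoting on (the default) and once with it off
quoted=$(git -C "$tmp/demo" -c core.quotepath=on ls-tree --full-tree -r --name-only HEAD)
plain=$(git -C "$tmp/demo" -c core.quotepath=off ls-tree --full-tree -r --name-only HEAD)
echo "$quoted"
echo "$plain"
```

The first line comes out as a quoted string of octal escapes, the second as the readable name.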


# to view one file's content, say, on the master branch
# (the path is relative to the repo root, so no leading slash)

$ git show master:path/to/file


Optionally convert to zh_TW (Traditional Chinese):

$ git show master:path/to/file |  cconv -f utf8-cn -t utf8-tw


