Sunday, October 31, 2021

Manjaro Linux: syntax/source highlighting/coloring in vim/bash terminal

If you work mostly with text-based tools like I do, syntax coloring (source code highlighting) can be helpful.


# 1. use GUI add/remove software to install 2 packages:


    syntax-highlighting

    source-highlight


# 2. edit  ~/.bashrc  and add the lines below

# (on other linux distros, src-hilite-lesspipe.sh may be at a different path)


export LESSOPEN="| /usr/bin/src-hilite-lesspipe.sh %s"

export LESS=" -R "

alias less='less -m -N -g -i -J --underline-special --SILENT'
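
The note about other distros can be handled inside ~/.bashrc itself. The sketch below is one way to do it and is not part of the original setup: it tries a list of candidate paths and only exports LESSOPEN when the script is actually found. The helper name find_lesspipe is made up for illustration, and the second path is an assumption about where Debian/Ubuntu-style packages put the script.

```shell
# Print the first candidate path that exists and is executable.
find_lesspipe() {
    for candidate in "$@"; do
        if [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# The second path is an assumption (Debian/Ubuntu-style layout); adjust as needed.
if lesspipe_path=$(find_lesspipe \
        /usr/bin/src-hilite-lesspipe.sh \
        /usr/share/source-highlight/src-hilite-lesspipe.sh); then
    export LESSOPEN="| $lesspipe_path %s"
    export LESS=" -R "
fi
```

If neither path exists, less keeps working as before, just without colors.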


# 3. edit  ~/.vimrc  and add


if &t_Co > 2 || has("gui_running")

    syntax on

    set hlsearch

endif



* * * 

# to test

open a terminal


$ ls 

$ less test.cpp

$ less test.py

$ vim test.py
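
If you have no test.cpp or test.py lying around, a throwaway pair is enough to see the colors. The file contents and the temporary directory below are just examples, not anything the setup depends on.

```shell
# Create a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)

# A tiny Python file with a keyword, a string, and a function call.
cat > "$workdir/test.py" <<'EOF'
def greet(name):
    return "hello, " + name

print(greet("world"))
EOF

# A tiny C++ file with an include, a type, and a string literal.
cat > "$workdir/test.cpp" <<'EOF'
#include <iostream>
int main() {
    std::cout << "hello\n";
    return 0;
}
EOF

echo "sample files written to $workdir"
```

Then cd into that directory and run the less and vim commands above.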





Monday, October 11, 2021

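Chinese text files show up garbled in bash: "Unicode" (UTF-16) may need converting to UTF-8

Mac OS X and Linux bash generally use UTF-8, a variable-length encoding of 1 to 4 bytes per character. For example, an ASCII character takes 1 byte and a Chinese character usually takes 3 bytes.

What Windows tools call "Unicode" is usually UTF-16, which stores most characters, including common Chinese ones, in 2 bytes each. Such text files generally start with a byte order mark (BOM). They open without problems in GUI text editors, but show up garbled in bash and cannot be processed with commands such as cat and sed.

For example, here is the start of a subtitle .ssa file: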

00000000: fffe 5b00 5300 6300 7200 6900 7000 7400  ..[.S.c.r.i.p.t.

00000010: 2000 4900 6e00 6600 6f00 5d00 0d00 0a00   .I.n.f.o.].....

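The first 2 bytes (ff fe) are the BOM, which here marks the file as little-endian UTF-16. For the meaning of each mark, see https://zh.wikipedia.org/wiki/位元組順序記號


The fix is iconv, a powerful tool for converting text encodings: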

$ iconv -f UTF-16 -t UTF-8 in.txt  > out.txt
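
The BOM check and the conversion can be combined so that only files that need it get converted. This is a rough sketch, not a complete encoding detector: the function names detect_bom and to_utf8 are invented here, it names the source encoding explicitly as UTF-16, and it only looks at the first three bytes (so UTF-32 and BOM-less files are not handled).

```shell
# Print the encoding implied by a file's byte order mark, or "none".
detect_bom() {
    # Read the first three bytes as lowercase hex with no separators.
    case "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" in
        efbbbf*) echo "UTF-8" ;;
        fffe*)   echo "UTF-16LE" ;;
        feff*)   echo "UTF-16BE" ;;
        *)       echo "none" ;;
    esac
}

# Write a UTF-8 copy of $1 to $2; files without a UTF-16 BOM are copied unchanged.
to_utf8() {
    case "$(detect_bom "$1")" in
        UTF-16LE|UTF-16BE) iconv -f UTF-16 -t UTF-8 "$1" > "$2" ;;
        *)                 cp "$1" "$2" ;;
    esac
}
```

Usage would be something like `to_utf8 in.txt out.txt`, with both file names as placeholders.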


* * * 

By the way, to delete the ^M at the end of every line, you can use sed:

s/^M//g

Note that ^M is a single character (a carriage return); type it with ctrl-v ctrl-m.
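
For scripts, where typing ctrl-v ctrl-m is awkward, the carriage return can be produced with printf instead. A small sketch of that idea follows; the function name strip_cr is made up here, and unlike the expression above it only removes a carriage return at the very end of a line.

```shell
# Hold a literal carriage return so the sed expression needs no ctrl-v ctrl-m.
cr=$(printf '\r')

# Print a file with the trailing carriage return removed from each line.
strip_cr() {
    sed "s/${cr}\$//" "$1"
}
```

Redirect the output to a new file, for example `strip_cr in.txt > out.txt`.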


Sunday, October 03, 2021

Installing the Cangjie Chinese input method on Manjaro

Saturday, October 02, 2021

Choosing a file system for an external USB drive

File System <--> Operating System   built-in behaviour

    FS       Windows   MacOS   Linux
    FAT32    RW        RW      RW      # file sizes < 4GB, plain FS no permission, etc.
    NTFS     RW        R       RW      # proprietary FS
    exFAT    RW        RW      RW      # proprietary FS, bigger sector size than FAT32
    APFS     -         RW      R?      # by Mac
    ext4     -         F*      RW
    samba    RW        RW      RW      # or other network services
    --------------------------------------------------------
    *F - Forgot, not care.   :D


Drives encrypted by Linux or MacOS are another story, of course.

NTFS is Microsoft's proprietary file system.  If a partition has errors, it cannot be repaired from Linux or MacOS.  In Windows, try right-clicking the drive icon, then Properties > Tools > Check disk.  DO NOT save browser pages (the html and _files might have invalid file names) from a non-Windows OS to NTFS!
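
Before copying a folder to an NTFS drive from Linux or MacOS, it can help to list the names Windows would reject. The sketch below is only a partial check under my own assumptions: the function name is invented, and it looks for the characters < > : " | ? * in file and directory names but not for backslashes, trailing dots or spaces, or reserved names such as CON.

```shell
# List entries under $1 whose names contain characters that Windows rejects.
# Only < > : " | ? * are checked; other Windows naming rules are not covered.
find_windows_unsafe_names() {
    find "$1" -name '*[<>:"|?*]*'
}
```

An empty result from `find_windows_unsafe_names ~/saved_pages` (a placeholder path) means none of those characters were found.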

exFAT is widely supported.  It can handle files larger than 4GB, such as movies, with the tradeoff of slightly more overhead.  Otherwise, FAT32 is still a good choice that works most of the time.
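
To decide between the two for a given backup, you can check whether anything in it is too big for FAT32, whose largest possible file is 4 GiB minus 1 byte (4294967295 bytes). A minimal sketch, with a made-up function name and an optional second argument for the byte limit:

```shell
# List files under $1 larger than a byte limit (default: the FAT32 maximum file size).
find_oversized() {
    find "$1" -type f -size +"${2:-4294967295}"c
}
```

If `find_oversized ~/Videos` (a placeholder path) prints nothing, every file there fits on FAT32.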

One thing to keep in mind is that FAT32 does not support user permissions.  That is not a problem for backing up most files externally, until you want to clone a git repo: all the file permissions get messed up, and git status will tell you every file was modified, to mode 777.
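
git has a core.fileMode setting that tells it to ignore permission-bit differences, which is one possible way around this when a repo has been copied onto a FAT32 drive. The wrapper function below is only an illustration (its name is invented); the setting is stored in that repo's own config and does not affect other repos.

```shell
# Tell git to ignore permission-bit changes in the repository at $1.
ignore_file_modes() {
    git -C "$1" config core.fileMode false
}
```

After running it on the copy, git status should stop reporting files whose only change is the mode.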

For flash media such as micro SD cards, use FAT32 up to 32GB and exFAT for bigger capacities.  This is quite standard.

