How do I split a large file into many smaller files using bash?
bash
I have a file, say `all`, with 2000 lines, and I'd like it split into 4 small files containing lines 1-500, 501-1000, 1001-1500, and 1501-2000.
Perhaps I can do:
cat all | head -500 >small1
cat all | tail -1500 | head -500 >small2
cat all | tail -1000 | head -500 >small3
cat all | tail -500 >small4
But this approach involves computing line counts, which can go wrong when the line count isn't a round number, or when we want to split the file into many small files (e.g., a file `all` with 3241 lines that we want to split into 7 files of 463 lines each).
Is there a better way to do this?
Source: CodeGo.net/2693017/
-------------------------------------------------------------------------------------------------------------------------
1. If you want to split the file, use split:
split -l 500 all all
This splits the file into several files of 500 lines each. If you instead want to split it into 4 files of roughly equal size, use something like:
split -l $(( $( wc -l < all ) / 4 + 1 )) all all
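The `+ 1` above is a blunt way of rounding up. For the 3241-line, 7-file case mentioned in the question, the same idea with exact ceiling division looks like this (a sketch; the filenames `all` and `part` are just illustrative):

```shell
seq 3241 > all   # sample 3241-line file standing in for the real data

parts=7
# Ceiling division: round lines-per-file up so no leftover part is created.
lines=$(( ( $(wc -l < all) + parts - 1 ) / parts ))
split -l "$lines" --numeric-suffixes=1 --suffix-length=1 all part

wc -l part*      # 7 files of 463 lines each
```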
2.
Look into the split command; it should do what you want (and more):
$ split --help
Usage: split [OPTION]... [INPUT [PREFIX]]
Output fixed-size pieces of INPUT to PREFIXaa, PREFIXab, ...; default
size is 1000 lines, and default PREFIX is 'x'. With no INPUT, or when INPUT
is -, read standard input.
Mandatory arguments to long options are mandatory for short options too.
-a, --suffix-length=N generate suffixes of length N (default 2)
--additional-suffix=SUFFIX append an additional SUFFIX to file names.
-b, --bytes=SIZE put SIZE bytes per output file
-C, --line-bytes=SIZE put at most SIZE bytes of lines per output file
-d, --numeric-suffixes[=FROM] use numeric suffixes instead of alphabetic.
FROM changes the start value (default 0).
-e, --elide-empty-files do not generate empty output files with '-n'
--filter=COMMAND write to shell COMMAND; file name is $FILE
-l, --lines=NUMBER put NUMBER lines per output file
-n, --number=CHUNKS generate CHUNKS output files. See below
-u, --unbuffered immediately copy input to output with '-n r/...'
--verbose print a diagnostic just before each
output file is opened
--help display this help and exit
--version output version information and exit
SIZE is an integer and optional unit (example: 10M is 10*1024*1024). Units
are K, M, G, T, P, E, Z, Y (powers of 1024) or KB, MB, ... (powers of 1000).
CHUNKS may be:
N split into N files based on size of input
K/N output Kth of N to stdout
l/N split into N files without splitting lines
l/K/N output Kth of N to stdout without splitting lines
r/N like 'l' but use round robin distribution
r/K/N likewise but only output Kth of N to stdout
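One option worth highlighting from the help above is the K/N form, which writes a single chunk to stdout instead of creating files. A minimal sketch (the choice of chunk 3 of 4 is arbitrary):

```shell
seq 2000 > all        # sample 2000-line input, as in the question

# Emit only the 3rd of 4 line-preserving chunks on stdout;
# no output files are created.
split -n l/3/4 all
```

Note that with `l/K/N` the chunk boundaries are byte-based but rounded to whole lines, so the chunks need not have identical line counts.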
3.
Like others have said, use split. The command substitution in the accepted answer isn't necessary. For the record, here is almost exactly what was asked for. Note the -n option, which specifies the number of chunks; with it, the small* files do not contain exactly 500 lines each, which is inherent to how split -n works.
$ seq 2000 > all
$ split -n l/4 --numeric-suffixes=1 --suffix-length=1 all small
$ wc -l small*
583 small1
528 small2
445 small3
444 small4
2000 total
Alternatively, you can use GNU parallel:
$ < all parallel -N500 --pipe --cat cp {} small{#}
$ wc -l small*
500 small1
500 small2
500 small3
500 small4
2000 total
As you can see, this incantation is more complex, but GNU Parallel's real strength is parallelizing pipelines. IMHO it's a tool worth looking into.
本文標(biāo)題 :如何使用bash一個大文件分割成許多小文件?
本文地址 :CodeGo.net/2693017/
|