Introduction to the TIMIT database:
The TIMIT corpus contains 630 speakers, each reading 10 sentences, and covers the 8 major dialect regions of American English.

TIMIT S5 example

First, copy the TIMIT directory from TIMIT.ISO into your home folder.

1. Change into the recipe directory and run the data preparation script:

zhangju@ubuntu:~$ cd kaldi-trunk/egs/timit/s5/
zhangju@ubuntu:~/kaldi-trunk/egs/timit/s5$ sudo local/timit_data_prep.sh /home/zhangju/TIMIT

You should see output like:

Creating coretest set.
MDAB0 MWBT0 FELC0 MTAS1 MWEW0 FPAS0 MJMP0 MLNT0 FPKT0 MLLL0 MTLS0 FJLM0 MBPM0 MKLT0 FNLP0 MCMJ0 MJDH0 FMGD0 MGRT0 MNJM0 FDHC0 MJLN0 MPAM0 FMLD0
# of utterances in coretest set = 192
Creating dev set.
FAKS0 FDAC1 FJEM0 MGWT0 MJAR0 MMDB1 MMDM2 MPDF0 FCMH0 FKMS0 MBDG0 MBWM0 MCSH0 FADG0 FDMS0 FEDW0 MGJF0 MGLB0 MRTK0 MTAA0 MTDT0 MTHC0 MWJG0 FNMR0 FREW0 FSEM0 MBNS0 MMJR0 MDLS0 MDLF0 MDVC0 MERS0 FMAH0 FDRW0 MRCS0 MRJM4 FCAL1 MMWH0 FJSJ0 MAJC0 MJSW0 MREB0 FGJD0 FJMG0 MROA0 MTEB0 MJFC0 MRJR0 FMML0 MRWS1
# of utterances in dev set = 400
Finalizing test
Finalizing dev
timit_data_prep succeeded.

This creates a new data folder under /home/zhangju/kaldi-trunk/egs/timit/s5, containing a local folder and related files.

2. In the terminal, run:

local/timit_train_lms.sh data/local   (fetches and processes the text used to build the language model)
local/timit_format_data.sh   (prepares the FST-related files)

3. Create the MFCC features. This must be done for train, dev, and test (the trailing 4 is the number of parallel jobs):

sudo steps/make_mfcc.sh data/train exp/make_mfcc/train mfccs 4

You should see:
Succeeded creating MFCC features for train

sudo steps/make_mfcc.sh data/test exp/make_mfcc/test mfccs 4

You should see:
Succeeded creating MFCC features for test

sudo steps/make_mfcc.sh data/dev exp/make_mfcc/dev mfccs 4

You should see:
Succeeded creating MFCC features for dev

4. Train the monophone system:

sudo steps/train_mono.sh data/train data/lang exp/mono

This shows:

Computing cepstral mean and variance statistics
Initializing monophone system.
Compiling training graphs
Pass 0
Pass 1
Aligning data
Pass 2
Aligning data
Pass 3
Aligning data
Pass 4
Aligning data
Pass 5
Aligning data
Pass 6
Aligning data
Pass 7
Aligning data
Pass 8
Aligning data
Pass 9
Aligning data
Pass 10
Aligning data
Pass 11
Pass 12
Aligning data
Pass 13
Pass 14
Pass 15
Aligning data
Pass 16
Pass 17
Pass 18
Pass 19
Pass 20
Aligning data
Pass 21
Pass 22
Pass 23
Pass 24
Pass 25
Aligning data
Pass 26
Pass 27
Pass 28
Pass 29

This creates the exp/mono folder.
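At this point you can sanity-check the trained model with Kaldi's gmm-info tool. A minimal sketch, assuming gmm-info was built along with the rest of Kaldi and is on your PATH:

gmm-info exp/mono/final.mdl

Among other things it reports the number of phones, PDFs, and Gaussians in the model, and the feature dimension (typically 39 when deltas and accelerations are appended to 13-dimensional MFCCs), which should match what you expect for a monophone system.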
Next, build the decoding graph:

scripts/mkgraph.sh --mono data/lang exp/mono exp/mono/graph

This shows:

fsttablecompose data/lang/L.fst data/lang/G.fst
fstdeterminizestar --use-log=true
fstminimizeencoded
fstisstochastic data/lang/tmp/LG.fst
-0.000244359 -0.0912761
warning: LG not stochastic.
fstcomposecontext --context-size=1 --central-position=0 --read-disambig-syms=data/lang/tmp/disambig_phones.list --write-disambig-syms=data/lang/tmp/disambig_ilabels_1_0.list data/lang/tmp/ilabels_1_0
fstisstochastic data/lang/tmp/CLG_1_0.fst
-0.000244359 -0.0912761
warning: CLG not stochastic.
make-h-transducer --disambig-syms-out=exp/mono/graph/disambig_tid.list --transition-scale=1.0 data/lang/tmp/ilabels_1_0 exp/mono/tree exp/mono/final.mdl
fstminimizeencoded
fstdeterminizestar --use-log=true
fsttablecompose exp/mono/graph/Ha.fst data/lang/tmp/CLG_1_0.fst
fstrmsymbols exp/mono/graph/disambig_tid.list
fstrmepslocal
fstisstochastic exp/mono/graph/HCLGa.fst
0.000331581 -0.091291
HCLGa is not stochastic
add-self-loops --self-loop-scale=0.1 --reorder=true exp/mono/final.mdl

5. Decode the test sets (here $test ranges over the dev and test folders under */s5/data):

for test in dev test ; do
  steps/decode_deltas.sh exp/mono data/$test data/lang exp/mono/decode_$test &
done

The terminal prints the background-job IDs:
[1] 2307
[2] 2308

6. Average the word error rates over the decoded sets:

scripts/average_wer.sh exp/mono/decode_*/wer > exp/mono/wer

When the background jobs finish, the terminal shows:
[1]-  Done  steps/decode_deltas.sh exp/mono data/$test data/lang exp/mono/decode_$test
[2]+  Done  steps/decode_deltas.sh exp/mono data/$test data/lang exp/mono/decode_$test

7. Obtain alignments from the monophone system (separately for train, dev, and test; these are used to train further systems):

steps/align_deltas.sh data/train data/lang exp/mono exp/mono_ali_train

This shows:
Computing cepstral mean and variance statistics
Aligning all training data
Done.

Method 2: edit the TIMIT path in run.sh, then simply run run.sh directly.

TIMIT S3 example

1. Data preparation. Run:

local/timit_data_prep.sh /home/zhangju/TIMIT

The terminal shows:

Creating coretest set.
MDAB0 MWBT0 FELC0 MTAS1 MWEW0 FPAS0 MJMP0 MLNT0 FPKT0 MLLL0 MTLS0 FJLM0 MBPM0 MKLT0 FNLP0 MCMJ0 MJDH0 FMGD0 MGRT0 MNJM0 FDHC0 MJLN0 MPAM0 FMLD0
(these are speaker IDs; a leading M or F marks a male or female speaker)
# of utterances in coretest set = 192
Creating dev set.
FAKS0 FDAC1 FJEM0 MGWT0 MJAR0 MMDB1 MMDM2 MPDF0 FCMH0 FKMS0 MBDG0 MBWM0 MCSH0 FADG0 FDMS0 FEDW0 MGJF0 MGLB0 MRTK0 MTAA0 MTDT0 MTHC0 MWJG0 FNMR0 FREW0 FSEM0 MBNS0 MMJR0 MDLS0 MDLF0 MDVC0 MERS0 FMAH0 FDRW0 MRCS0 MRJM4 FCAL1 MMWH0 FJSJ0 MAJC0 MJSW0 MREB0 FGJD0 FJMG0 MROA0 MTEB0 MJFC0 MRJR0 FMML0 MRWS1
# of utterances in dev set = 400
Finalizing test
Finalizing dev
timit_data_prep succeeded.

Run:

local/timit_train_lms.sh data/local

The terminal shows:

Not installing the kaldi_lm toolkit since it is already there.

(The kaldi_lm toolkit contains:
compute_perplexity: computes perplexity (used to evaluate a language model; lower is better)
discount_ngrams: smooths the n-gram model (reserving probability mass for word combinations that occur in practice but were not seen among the n-grams)
get_raw_ngrams: produces the raw n-gram counts
get_word_map.pl: builds the word mapping table
interpolate_ngrams: interpolates (adjusts) the n-gram model
finalize_arpa.pl: finalizes the ARPA output (ARPA is a language-model file format); called from interpolate_ngrams
map_words_in_arpa.pl: maps the words in the ARPA file
merge_ngrams: merges n-gram models
merge_ngrams_online: merges n-gram models online
optimize_alpha.pl: optimizes the alpha parameter
prune_lm.sh: prunes low-frequency entries from the LM
prune_ngrams: prunes low-frequency n-grams
scale_configs.pl
train_lm.sh: trains the language model
uniq_to_ngrams)
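As a reminder of what compute_perplexity measures: the perplexity of a language model over a held-out word sequence w_1, ..., w_N is the inverse geometric mean of the per-word probabilities,

PPL = P(w_1, \dots, w_N)^{-1/N} = \exp\Big(-\frac{1}{N} \sum_{i=1}^{N} \log P(w_i \mid w_1, \dots, w_{i-1})\Big)

so a lower value means the model assigns the text higher probability. The "excluding OOVs" figure in the logs below is the same quantity with out-of-vocabulary words left out of the sum.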
The script output continues:

Creating phones file, and monophone lexicon (mapping phones to itself).
Creating biphone model
Training biphone language model in folder data/local/lm
Creating directory data/local/lm/biphone
Getting raw N-gram counts
Iteration 1/7 of optimizing discounting parameters
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.900000 phi=2.000000
interpolate_ngrams: 60 words in wordslist
discount_ngrams: for n-gram order 2, D=0.600000, tau=0.900000 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.800000, tau=1.100000 phi=2.000000
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.675000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.600000, tau=0.675000 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.800000, tau=0.825000 phi=2.000000
interpolate_ngrams: 60 words in wordslist
discount_ngrams: for n-gram order 1, D=0.400000, tau=1.215000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.600000, tau=1.215000 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.800000, tau=1.485000 phi=2.000000
interpolate_ngrams: 60 words in wordslist
Perplexity over 11412.000000 words is 17.013357
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.460842
real 0m0.021s  user 0m0.012s  sys 0m0.000s
Perplexity over 11412.000000 words is 17.016472
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.464985
real 0m0.020s  user 0m0.012s  sys 0m0.000s
Perplexity over 11412.000000 words is 17.021475
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.471402
real 0m0.025s  user 0m0.012s  sys 0m0.000s
optimize_alpha.pl: alpha=-2.1628504673 is too negative, limiting it to -0.5
Projected perplexity change from setting alpha=-0.5 is 17.016472->17.0106241428571, reduction of 0.00584785714286085
Alpha value on iter 1 is -0.5
Iteration 2/7 of optimizing discounting parameters
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.600000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.600000, tau=0.550000 phi=2.000000
interpolate_ngrams: 60 words in wordslist
interpolate_ngrams: 60 words in wordslist
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.600000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.800000, tau=0.550000 phi=2.000000
interpolate_ngrams: 60 words in wordslist
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.600000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 3, D=1.080000, tau=0.550000 phi=2.000000
Perplexity over 11412.000000 words is 17.011355
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.457880
real 0m0.018s  user 0m0.004s  sys 0m0.008s
Perplexity over 11412.000000 words is 17.011355
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.457880
real 0m0.022s  user 0m0.012s  sys 0m0.000s
Perplexity over 11412.000000 words is 17.011355
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.457880
real 0m0.019s  user 0m0.008s  sys 0m0.004s
optimize_alpha.pl: objective function is not convex; returning alpha=0.7
Projected perplexity change from setting alpha=0.7 is 17.011355->17.011355, reduction of 0
Alpha value on iter 2 is 0.7
Iteration 3/7 of optimizing discounting parameters
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.600000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 3, D=1.360000, tau=0.412500 phi=2.000000
interpolate_ngrams: 60 words in wordslist
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.600000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 3, D=1.360000, tau=0.550000 phi=2.000000
interpolate_ngrams: 60 words in wordslist
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.600000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 3, D=1.360000, tau=0.742500 phi=2.000000
interpolate_ngrams: 60 words in wordslist
Perplexity over 11412.000000 words is 17.011355
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.457880
real 0m0.020s  user 0m0.012s  sys 0m0.000s
Perplexity over 11412.000000 words is 17.011355
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.457880
real 0m0.019s  user 0m0.008s  sys 0m0.004s
Perplexity over 11412.000000 words is 17.011355
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.457880
real 0m0.021s  user 0m0.012s  sys 0m0.000s
optimize_alpha.pl: objective function is not convex; returning alpha=0.7
Projected perplexity change from setting alpha=0.7 is 17.011355->17.011355, reduction of 0
Alpha value on iter 3 is 0.7
Iteration 4/7 of optimizing discounting parameters
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.600000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=1.750000
interpolate_ngrams: 60 words in wordslist
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.600000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.000000
interpolate_ngrams: 60 words in wordslist
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.600000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.350000
interpolate_ngrams: 60 words in wordslist
Perplexity over 11412.000000 words is 17.011355
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.457880
real 0m0.018s  user 0m0.012s  sys 0m0.000s
Perplexity over 11412.000000 words is 17.011355
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.457880
real 0m0.018s  user 0m0.012s  sys 0m0.000s
Perplexity over 11412.000000 words is 17.011355
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.457880
real 0m0.023s  user 0m0.012s  sys 0m0.000s
optimize_alpha.pl: objective function is not convex; returning alpha=0.7
Projected perplexity change from setting alpha=0.7 is 17.011355->17.011355, reduction of 0
Alpha value on iter 4 is 0.7
Iteration 5/7 of optimizing discounting parameters
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.450000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.700000
interpolate_ngrams: 60 words in wordslist
interpolate_ngrams: 60 words in wordslist
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.600000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.700000
interpolate_ngrams: 60 words in wordslist
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.810000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.700000
Perplexity over 11412.000000 words is 17.008195
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.454326
real 0m0.019s  user 0m0.008s  sys 0m0.004s
Perplexity over 11412.000000 words is 17.011355
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.457880
real 0m0.019s  user 0m0.012s  sys 0m0.000s
Perplexity over 11412.000000 words is 17.018212
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.465417
real 0m0.021s  user 0m0.012s  sys 0m0.000s
optimize_alpha.pl: alpha=-0.670499383475985 is too negative, limiting it to -0.5
Projected perplexity change from setting alpha=-0.5 is 17.011355->17.0064832142857, reduction of 0.00487178571427904
Alpha value on iter 5 is -0.5
Iteration 6/7 of optimizing discounting parameters
interpolate_ngrams: 60 words in wordslist
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.300000, tau=0.337500 phi=2.000000
discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.700000
interpolate_ngrams: 60 words in wordslist
discount_ngrams: for n-gram order 2, D=0.300000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.700000
interpolate_ngrams: 60 words in wordslist
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.300000, tau=0.607500 phi=2.000000
discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.700000
Perplexity over 11412.000000 words is 17.008198
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.454134
real 0m0.019s  user 0m0.012s  sys 0m0.000s
Perplexity over 11412.000000 words is 17.006972
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.452861
real 0m0.020s  user 0m0.012s  sys 0m0.000s
Perplexity over 11412.000000 words is 17.006526
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.452349
real 0m0.022s  user 0m0.012s  sys 0m0.000s
Projected perplexity change from setting alpha=0.280321158690507 is 17.006972->17.0064966287094, reduction of 0.000475371290633575
Alpha value on iter 6 is 0.280321158690507
Iteration 7/7 of optimizing discounting parameters
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.300000, tau=0.576145 phi=1.750000
discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.700000
interpolate_ngrams: 60 words in wordslist
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.300000, tau=0.576145 phi=2.350000
discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.700000
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.300000, tau=0.576145 phi=2.000000
discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.700000
interpolate_ngrams: 60 words in wordslist
interpolate_ngrams: 60 words in wordslist
Perplexity over 11412.000000 words is 17.006845
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.452750
real 0m0.019s  user 0m0.012s  sys 0m0.000s
Perplexity over 11412.000000 words is 17.006575
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.452414
real 0m0.021s  user 0m0.012s  sys 0m0.000s
Perplexity over 11412.000000 words is 17.006336
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.452127
real 0m0.022s  user 0m0.012s  sys 0m0.000s
Projected perplexity change from setting alpha=0.690827338145686 is 17.006575->17.0062591109755, reduction of 0.000315889024498972
Alpha value on iter 7 is 0.690827338145686
Final config is:
D=0.4 tau=0.45 phi=2.0
D=0.3 tau=0.576144521410728 phi=2.69082733814569
D=1.36 tau=0.935 phi=2.7
Discounting N-grams.
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.450000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.300000, tau=0.576145 phi=2.690827
discount_ngrams: for n-gram order 3, D=1.360000, tau=0.935000 phi=2.700000
Computing final perplexity
Building ARPA LM (perplexity computation is in background)
interpolate_ngrams: 60 words in wordslist
interpolate_ngrams: 60 words in wordslist
Perplexity over 11412.000000 words is 17.006029
Perplexity over 10833.000000 words (excluding 579.000000 OOVs) is 17.451754
17.006029

Run:

local/timit_format_data.sh

The terminal shows:

Creating L.fst
Done creating L.fst
Creating L_disambig.fst
Done creating L_disambig.fst
Creating G.fst
arpa2fst -
\data\
Processing 1-grams
Processing 2-grams
Connected 0 states without outgoing arcs.
remove_oovs.pl: removed 0 lines.
G.fst created. How stochastic is it ?
fstisstochastic data/lang_test/G.fst
0 -0.0900995
fsttablecompose data/lang_test/L_disambig.fst data/lang_test/G.fst
How stochastic is LG.fst.
fstisstochastic data/lang_test/G.fst
0 -0.0900995
fstisstochastic
fsttablecompose data/lang/L.fst data/lang_test/G.fst
0 -0.0900994
How stochastic is LG_disambig.fst.
fsttablecompose data/lang_test/L_disambig.fst data/lang_test/G.fst
fstisstochastic
0 -0.0900994
First few lines of lexicon FST:
0 1 <eps> <eps> 0.356674939
0 1 sil <eps> 1.20397282
1 2 aa AA 1.20397282
1 1 aa AA 0.356674939
1 1 ae AE 0.356674939
1 2 ae AE 1.20397282
1 1 ah AH 0.356674939
1 2 ah AH 1.20397282
1 1 ao AO 0.356674939
1 2 ao AO 1.20397282
(each line is an FST arc: source state, destination state, input label, output label, weight)
timit_format_data succeeded.

Next, create the MFCC features:

mfccdir=mfccs
for test in train test dev ; do
  steps/make_mfcc.sh data/$test exp/make_mfcc/$test $mfccdir 4
done

The terminal shows:

Succeeded creating MFCC features for train
Succeeded creating MFCC features for test
Succeeded creating MFCC features for dev
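To verify the features were written correctly, you can dump a few frames in text form with Kaldi's copy-feats. A minimal sketch; the archive name below assumes the raw_mfcc_<set>.<job>.ark naming these older make_mfcc.sh scripts use inside the mfccs directory, so adjust it to whatever was actually written there:

copy-feats ark:mfccs/raw_mfcc_train.1.ark ark,t:- | head

Each utterance appears as its ID followed by a matrix with one MFCC row per frame (13-dimensional with the default config).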
2. Train the monophone system. Run:

steps/train_mono.sh data/train data/lang exp/mono

The terminal shows:

Computing cepstral mean and variance statistics
Initializing monophone system.
Compiling training graphs
Pass 0
Pass 1
Aligning data
Pass 2
Aligning data
Pass 3
Aligning data
Pass 4
Aligning data
Pass 5
Aligning data
Pass 6
Aligning data
Pass 7
Aligning data
Pass 8
Aligning data
Pass 9
Aligning data
Pass 10
Aligning data
Pass 11
Pass 12
Aligning data
Pass 13
Pass 14
Pass 15
Aligning data
Pass 16
Pass 17
Pass 18
Pass 19
Pass 20
Aligning data
Pass 21
Pass 22
Pass 23
Pass 24
Pass 25
Aligning data
Pass 26
Pass 27
Pass 28
Pass 29

Build the decoding graph:

scripts/mkgraph.sh --mono data/lang_test exp/mono exp/mono/graph

The terminal shows:

fsttablecompose data/lang_test/L_disambig.fst data/lang_test/G.fst
fstminimizeencoded
fstdeterminizestar --use-log=true
fstisstochastic data/lang_test/tmp/LG.fst
0 -0.0901494
warning: LG not stochastic.
fstcomposecontext --context-size=1 --central-position=0 --read-disambig-syms=data/lang_test/tmp/disambig_phones.list --write-disambig-syms=data/lang_test/tmp/disambig_ilabels_1_0.list data/lang_test/tmp/ilabels_1_0
fstisstochastic data/lang_test/tmp/CLG_1_0.fst
0 -0.0901494
warning: CLG not stochastic.
make-h-transducer --disambig-syms-out=exp/mono/graph/disambig_tid.list --transition-scale=1.0 data/lang_test/tmp/ilabels_1_0 exp/mono/tree exp/mono/final.mdl
fsttablecompose exp/mono/graph/Ha.fst data/lang_test/tmp/CLG_1_0.fst
fstdeterminizestar --use-log=true
fstminimizeencoded
fstrmsymbols exp/mono/graph/disambig_tid.list
fstrmepslocal
fstisstochastic exp/mono/graph/HCLGa.fst
0 -0.0901494
HCLGa is not stochastic
add-self-loops --self-loop-scale=0.1 --reorder=true exp/mono/final.mdl

3. Decode the test sets. Run:

for test in dev test ; do
  steps/decode_deltas.sh exp/mono data/$test data/lang exp/mono/decode_$test &
done

The terminal shows:
[1] 16368
[2] 16369

3.1 Compute the results. Run:

scripts/average_wer.sh exp/mono/decode_*/wer > exp/mono/wer

When the background jobs finish, the terminal shows:
[1]-  Done  steps/decode_deltas.sh exp/mono data/$test data/lang exp/mono/decode_$test
[2]+  Done  steps/decode_deltas.sh exp/mono data/$test data/lang exp/mono/decode_$test

4. Obtain alignments from the monophone system. The alignments are used to train other systems, such as ANN-HMM hybrids. Run:

steps/align_deltas.sh data/train data/lang exp/mono exp/mono_ali_train

The terminal shows:
Computing cepstral mean and variance statistics
Aligning all training data
Done.

steps/align_deltas.sh data/dev data/lang exp/mono exp/mono_ali_dev

Method 2: edit the corresponding TIMIT path in run.sh, then simply run run.sh directly.

TIMIT S4 example

This recipe builds a phone recognizer.

WORKDIR=/home/zhangju/ss4   (choose any path with enough free space as WORKDIR)
mkdir -p $WORKDIR
cp -r conf local utils steps path.sh $WORKDIR
cd $WORKDIR
. path.sh

(Before sourcing path.sh, edit its KALDIROOT variable so it points to your own Kaldi tree, e.g. KALDIROOT=/home/mayuan/kaldi-trunk; I changed it with nano. A sed one-liner alternative is sketched at the end of this section.)

local/timit_data_prep.sh --config-dir=$PWD/conf --corpus-dir=/home/zhangju/TIMIT --work-dir=$WORKDIR
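Note: instead of editing path.sh by hand in nano, the KALDIROOT line can be rewritten from the shell. A minimal sketch, assuming path.sh defines the variable on a line beginning with KALDIROOT= and that your Kaldi tree is at /home/zhangju/kaldi-trunk (adjust the path to your own installation):

sed -i 's|^KALDIROOT=.*|KALDIROOT=/home/zhangju/kaldi-trunk|' path.sh

Run this inside $WORKDIR before sourcing path.sh so the new value takes effect.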