Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Loading pretrained Word2vec model

I have some pretrained Word2vec models that I need to load in both Python and R. They come as .txt.gz files; I extracted them with 7-Zip to get the .txt files. I use the following code to load one of them in Python:

import gensim
model = gensim.models.KeyedVectors.load_word2vec_format('syn0_ngram_1900_1909_full.txt', binary=False)

However, I get these errors:

File "<ipython-input-6-31e3fee59511>", line 1, in <module>
    model = gensim.models.KeyedVectors.load_word2vec_format('syn0_ngram_1900_1909_full.txt' , binary = False)

  File "C:\Users\PSU\anaconda3\lib\site-packages\gensim\models\keyedvectors.py", line 1547, in load_word2vec_format
    return _load_word2vec_format(

  File "C:\Users\PSU\anaconda3\lib\site-packages\gensim\models\utils_any2vec.py", line 277, in _load_word2vec_format
    vocab_size, vector_size = (int(x) for x in header.split())  # throws for invalid file format

  File "C:\Users\PSU\anaconda3\lib\site-packages\gensim\models\utils_any2vec.py", line 277, in <genexpr>
    vocab_size, vector_size = (int(x) for x in header.split())  # throws for invalid file format

ValueError: invalid literal for int() with base 10: '-1.552761644124984741e-01'
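
The traceback shows gensim splitting the first line of the file and converting each field with int(), expecting a header of two integers (vocab_size and vector_size). A minimal sketch of that check in plain Python, which may help to test a file before passing it to gensim. The helper name parse_word2vec_header is hypothetical and not part of gensim:

```python
def parse_word2vec_header(first_line):
    """Return (vocab_size, vector_size) if the line looks like a
    word2vec text header, otherwise None.

    Hypothetical helper for illustration; it mirrors the int() parsing
    visible in the traceback but is not gensim code.
    """
    fields = first_line.split()
    if len(fields) != 2:
        return None
    try:
        vocab_size, vector_size = (int(x) for x in fields)
    except ValueError:
        # A float such as '-1.55e-01' cannot be parsed by int().
        return None
    return vocab_size, vector_size


# A line of floats, like the start of the file in question, is not a header.
print(parse_word2vec_header("-1.552761644124984741e-01 -4.447535425424575806e-02"))
# A header of two integers is accepted.
print(parse_word2vec_header("3 300"))
```

If the first line of the extracted file returns None here, the file is probably not in the word2vec text format that load_word2vec_format expects.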

The first 3 lines of the file look like this:

-1.552761644124984741e-01 -4.447535425424575806e-02 2.490501850843429565e-01 6.936315447092056274e-02 4.821906611323356628e-02 1.374670863151550293e-01 -3.902152925729751587e-02 -9.022397547960281372e-02 -2.184277474880218506e-01 2.209668904542922974e-01 -3.469136059284210205e-01 -1.411392092704772949e-01 -4.453907907009124756e-01 -1.025922745466232300e-01 8.120749890804290771e-02 -4.073899090290069580e-01 -1.905823498964309692e-02 -8.645015209913253784e-02 -7.905063778162002563e-02 7.617127150297164917e-02 1.504308432340621948e-01 8.758410811424255371e-03 3.396979570388793945e-01 -2.586390674114227295e-01 2.386739850044250488e-01 1.265343427658081055e-01 2.472167760133743286e-01 3.401717543601989746e-02 6.056435406208038330e-02 1.221914887428283691e-01 -8.905990421772003174e-02 -9.887123107910156250e-02 -9.832112491130828857e-02 -7.572189718484878540e-02 -7.369377184659242630e-03 1.302516758441925049e-01 -1.231815069913864136e-01 2.606352046132087708e-02 1.441473066806793213e-01 -1.717498451471328735e-01 9.777273982763290405e-03 6.302291993051767349e-03 8.990994654595851898e-03 -1.062645390629768372e-01 1.878743618726730347e-01 5.820669233798980713e-02 -1.301994770765304565e-01 1.007045060396194458e-01 -8.061264455318450928e-02 -1.923392526805400848e-02 9.648428112268447876e-02 -2.001685947179794312e-01 -1.039923876523971558e-01 1.369088292121887207e-01 3.344058617949485779e-02 -8.220246434211730957e-02 2.154016494750976562e-01 -1.533902585506439209e-01 -9.639452397823333740e-02 -1.077244579792022705e-01 3.839006647467613220e-02 -9.669522196054458618e-02 1.057831645011901855e-01 -1.731183379888534546e-01 -1.823752373456954956e-01 1.329025924205780029e-01 1.256246417760848999e-01 -5.657993257045745850e-02 -2.921316400170326233e-02 -6.282752007246017456e-02 1.006662324070930481e-01 6.356491148471832275e-02 1.589829623699188232e-01 1.770073324441909790e-01 4.362170770764350891e-02 -1.918367296457290649e-01 -3.448316827416419983e-02 -5.027920752763748169e-02 
9.733915328979492188e-02 3.966502845287322998e-01 3.811245039105415344e-02 -2.094386219978332520e-01 3.425932824611663818e-01 -2.924242429435253143e-02 2.225339598953723907e-02 2.787167727947235107e-01 1.680488288402557373e-01 7.655870169401168823e-02 3.952257335186004639e-02 -2.619512081146240234e-01 -3.033895492553710938e-01 -5.149876475334167480e-01 -1.642060428857803345e-01 -1.959302574396133423e-01 1.126131117343902588e-01 -2.267295867204666138e-01 2.400911971926689148e-02 4.052775725722312927e-02 -3.044707141816616058e-02 -3.633485138416290283e-01 -2.818429283797740936e-02 -4.622202217578887939e-01 9.291686117649078369e-02 -2.956802845001220703e-01 1.862034201622009277e-01 1.242911815643310547e-01 1.026049628853797913e-01 1.160985007882118225e-01 -1.380904614925384521e-01 1.792961508035659790e-01 1.492877304553985596e-01 2.356165647506713867e-01 -2.932927012443542480e-02 1.063521653413772583e-01 3.353847563266754150e-01 1.908604428172111511e-02 3.782559633255004883e-01 -1.517397463321685791e-01 3.612821102142333984e-01 1.607065051794052124e-01 9.656509757041931152e-02 1.245319694280624390e-01 1.315144896507263184e-01 7.511320710182189941e-02 -4.755245521664619446e-02 -2.734144330024719238e-01 3.033797740936279297e-01 5.215175449848175049e-03 2.141999304294586182e-01 -1.597059220075607300e-01 3.182544559240341187e-02 4.125118851661682129e-01 -2.834559679031372070e-01 2.971411049365997314e-01 2.584041953086853027e-01 2.266484946012496948e-01 1.358106434345245361e-01 -8.042504638433456421e-02 -2.925538420677185059e-01 6.947112828493118286e-02 3.138780593872070312e-01 -1.517586410045623779e-01 -2.561317682266235352e-01 -1.843494027853012085e-01 2.936672978103160858e-02 -1.237718015909194946e-01 6.020113825798034668e-02 5.157970264554023743e-02 1.483027786016464233e-01 1.515904515981674194e-01 -7.338423281908035278e-02 1.898889243602752686e-02 2.750496566295623779e-02 -6.313492357730865479e-02 -2.602603659033775330e-02 -4.748337436467409134e-03 
3.420833945274353027e-01 -5.720657855272293091e-02 -2.232243567705154419e-01 4.226108267903327942e-02 6.031884625554084778e-02 1.539045125246047974e-01 8.576720207929611206e-02 1.011675968766212463e-01 -3.795365989208221436e-01 -3.146133571863174438e-02 1.349445134401321411e-01 2.983746826648712158e-01 -2.938828170299530029e-01 1.533054113388061523e-01 -4.229364991188049316e-01 9.155936539173126221e-02 -2.974963048473000526e-03 -1.385585069656372070e-01 -1.053368579596281052e-02 1.153212636709213257e-01 3.379225432872772217e-01 -9.703439474105834961e-02 -1.578260511159896851e-01 -6.252604722976684570e-02 1.598290950059890747e-01 2.294627018272876740e-03 6.054456159472465515e-02 1.103171482682228088e-01 1.407995820045471191e-02 1.977602243423461914e-01 -7.971014082431793213e-02 6.747842580080032349e-02 -7.176994532346725464e-02 3.453086316585540771e-02 1.144322603940963745e-01 -1.870087534189224243e-01 -1.876662820577621460e-01 6.476462818682193756e-03 -8.064353466033935547e-02 -1.166440173983573914e-01 -3.607030212879180908e-02 -2.510503865778446198e-02 6.253489851951599121e-02 1.802610009908676147e-01 4.245756864547729492e-01 -1.071699485182762146e-01 -1.976074464619159698e-02 7.162892073392868042e-02 -2.126150727272033691e-01 -1.831589490175247192e-01 -7.786697894334793091e-02 1.421018242835998535e-01 2.083165943622589111e-01 -9.992305934429168701e-02 7.392542809247970581e-02 9.227126836776733398e-02 -1.524462252855300903e-01 2.111838459968566895e-01 1.633472144603729248e-01 6.497748196125030518e-02 6.825347244739532471e-02 -3.643653988838195801e-01 1.698636859655380249e-01 6.742136925458908081e-02 2.124408334493637085e-01 -2.609764039516448975e-01 -4.775075241923332214e-02 -1.276874262839555740e-02 -9.566855616867542267e-03 -7.416314631700515747e-02 -1.711301803588867188e-01 2.018006443977355957e-01 -4.967777058482170105e-03 7.954392582178115845e-02 -8.138674497604370117e-02 2.610500156879425049e-01 3.377711772918701172e-02 -2.635057568550109863e-01 
7.423927634954452515e-02 -2.577809691429138184e-01 -7.702536880970001221e-02 1.627112627029418945e-01 1.897031962871551514e-01 -1.299263685941696167e-01 1.664789579808712006e-02 -6.737360358238220215e-02 -2.183234542608261108e-01 2.616149485111236572e-01 -1.861911714076995850e-01 -8.766605705022811890e-02 5.951049551367759705e-02 3.398019671440124512e-01 1.241989135742187500e-01 1.123771518468856812e-01 2.735071256756782532e-02 7.581159472465515137e-03 -1.705877929925918579e-01 9.298118948936462402e-02 -5.501312017440795898e-02 2.464835159480571747e-02 1.904888302087783813e-01 1.251959949731826782e-01 -9.753731638193130493e-02 4.099815338850021362e-02 -3.088685572147369385e-01 4.752117022871971130e-02 -1.016708761453628540e-01 2.049167454242706299e-01 -1.110423356294631958e-01 -2.558538317680358887e-02 9.703662991523742676e-02 1.440881937742233276e-01 -1.499230116605758667e-01 4.630966186523437500e-01 1.560464948415756226e-01 -2.473618537187576294e-01 7.339747250080108643e-02 -1.125243376009166241e-03 2.308040857315063477e-03 -7.349326461553573608e-02 -5.643999949097633362e-02 -1.791801899671554565e-01 3.374390304088592529e-02 5.359465628862380981e-02 4.016261696815490723e-01 -8.631563186645507812e-02 -1.041909903287887573e-01 -9.027398191392421722e-03 7.635752111673355103e-02 -1.177581623196601868e-01 6.990105658769607544e-02 -1.495847105979919434e-01 -1.948498487472534180e-01 -1.003706827759742737e-01 2.158978767693042755e-02 2.253228724002838135e-01 -8.305017650127410889e-02 9.877178817987442017e-02 -1.782058775424957275e-01 -4.364309012889862061e-01 2.809965051710605621e-02 5.815667286515235901e-02 9.305762499570846558e-02 9.939935058355331421e-02
-2.755518853664398193e-01 7.426643371582031250e-02 1.305104941129684448e-01 1.733209006488323212e-02 3.392809331417083740e-01 4.914091154932975769e-02 -5.487316101789474487e-02 -2.893702983856201172e-01 -3.995743691921234131e-01 1.019903868436813354e-01 -5.586374923586845398e-02 -2.909922003746032715e-01 -1.379316449165344238e-01 -1.213544141501188278e-02 -2.101085036993026733e-01 -4.060855805873870850e-01 2.363941520452499390e-01 -1.304764747619628906e-01 -1.898821741342544556e-01 7.960485666990280151e-02 6.144599989056587219e-02 -8.303866721689701080e-03 1.456501632928848267e-01 -1.511054039001464844e-01 3.446572422981262207e-01 1.809655129909515381e-01 3.376641869544982910e-01 -1.289701908826828003e-01 1.942324079573154449e-02 1.295022666454315186e-01 1.819744110107421875e-01 9.251490980386734009e-02 1.657947003841400146e-01 -4.376604557037353516e-01 2.938240170478820801e-01 -1.873110830783843994e-01 -1.355587989091873169e-01 -2.293781042098999023e-01 -9.990473277866840363e-03 -1.429447233676910400e-01 4.837138950824737549e-02 4.135683923959732056e-02 1.273282319307327271e-01 -1.000547260046005249e-01 3.860374540090560913e-02 3.943286091089248657e-02 7.455765455961227417e-02 -1.942279636859893799e-01 1.055958718061447144e-01 -1.248219236731529236e-01 6.977072358131408691e-02 8.551878482103347778e-02 4.604674875736236572e-02 -1.508192718029022217e-01 -2.823450267314910889e-01 -1.705607175827026367e-01 1.018783375620841980e-01 9.879937022924423218e-02 -4.601259529590606689e-02 -1.719024218618869781e-02 -1.294963657855987549e-01 -5.334546416997909546e-02 1.102923452854156494e-01 3.475880622863769531e-02 3.030833788216114044e-02 3.598376810550689697e-01 1.075935140252113342e-01 1.747883707284927368e-01 2.600349187850952148e-01 -4.294164106249809265e-02 3.064307570457458496e-01 3.595127537846565247e-02 8.350577205419540405e-02 4.761104285717010498e-02 1.397927701473236084e-01 6.383475847542285919e-03 -1.242930628359317780e-02 -6.513260304927825928e-02 
-1.765230298042297363e-01 2.290750741958618164e-01 1.070840135216712952e-01 -1.611845940351486206e-01 2.256397455930709839e-01 3.962266817688941956e-02 -1.251329332590103149e-01 -8.839791268110275269e-02 -8.401984721422195435e-02 -1.068911850452423096e-01 4.183220565319061279e-01 -1.719796285033226013e-02 -1.992868930101394653e-01 -1.439917534589767456e-01 -3.158213943243026733e-02 1.782516241073608398e-01 -2.040623277425765991e-01 -2.465122565627098083e-02 3.390240948647260666e-03 -2.063101902604103088e-02 3.736664727330207825e-02 -1.950853466987609863e-01 7.347257435321807861e-02 -3.684818744659423828e-01 -3.807673603296279907e-02 -6.298073381185531616e-02 3.570814132690429688e-01 1.056838855147361755e-01 -6.606206297874450684e-02 1.103219836950302124e-01 1.340708583593368530e-01 1.316183954477310181e-01 1.801468431949615479e-01 2.364787608385086060e-01 -2.933555981144309044e-03 1.394167244434356689e-01 1.410789489746093750e-01 2.110916227102279663e-01 -3.954877331852912903e-02 -1.67269378900
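
The sample above contains only floats, with no word at the start of each row and no header line. The word2vec text format expects a first line "vocab_size vector_size" followed by one "word v1 v2 ..." row per word. Assuming the matching vocabulary is available separately and in the same order as the rows (an assumption; the post does not say where the words are stored), a rough sketch of converting such a file could look like this. The function name write_word2vec_text and its parameters are hypothetical:

```python
def write_word2vec_text(vectors_path, words, out_path):
    """Combine a vectors-only text file with a list of words and write
    the result in word2vec text format.

    Assumptions: one vector per line in vectors_path, words[i] belongs
    to row i, and no word contains whitespace.
    Returns (vocab_size, vector_size).
    """
    rows = []
    with open(vectors_path, encoding="utf-8") as src:
        for line in src:
            values = line.split()
            if values:  # skip blank lines
                rows.append(values)

    if len(rows) != len(words):
        raise ValueError(f"{len(rows)} vectors but {len(words)} words")
    vector_size = len(rows[0]) if rows else 0
    if any(len(row) != vector_size for row in rows):
        raise ValueError("rows do not all have the same number of values")

    with open(out_path, "w", encoding="utf-8") as dst:
        # Header line: number of words, then dimensionality.
        dst.write(f"{len(rows)} {vector_size}\n")
        for word, values in zip(words, rows):
            dst.write(word + " " + " ".join(values) + "\n")
    return len(rows), vector_size
```

The output file should then be loadable with gensim.models.KeyedVectors.load_word2vec_format(out_path, binary=False). This sketch keeps all rows in memory, so a very large model may call for a two-pass version instead.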

1 Reply

0 votes
by (71.8m points)
Waiting for answers
