I have some CSV files that I would be keen to parse with numpy.recfromcsv in one go with a transposed output where basically the light names become the field names. The reason to do it in one go is that I have a wrapper function above numpy.recfromcsv that converts the data to a dict for further usage down the line.
Wavelength(nm);380;384;388;392;396;400;404;408;412;416;420;424;428;432;436;440;444;448;452;456;460;464;468;472;476;480;484;488;492;496;500;504;508;512;516;520;524;528;532;536;540;544;548;552;556;560;564;568;572;576;580;584;588;592;596;600;604;608;612;616;620;624;628;632;636;640;644;648;652;656;660;664;668;672;676;680;684;688;692;696;700;704;708;712;716;720;724;728;732;736;740;744;748;752;756;760;764;768;772;776;780
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
Arri_650Plus_TU_L1_Spot;0.0649420138;0.0713359985;0.0767221834;0.0828162664;0.0896337031;0.0954200476;0.1015809445;0.1083380365;0.1152099744;0.1226002112;0.1304201937;0.1388171648;0.1474312908;0.1553977485;0.1641032319;0.1729020817;0.1813312407;0.1906956427;0.2004236261;0.2103836711;0.2197347994;0.2301930545;0.2408191448;0.2506142806;0.2611302546;0.2733759514;0.284116684;0.2948628201;0.3056525569;0.3177146507;0.3294133586;0.3397499278;0.3505554406;0.3636232187;0.3766502518;0.3874038118;0.3981092712;0.4107713627;0.4243557832;0.4382433069;0.4498096969;0.461665414;0.4747047349;0.4889017319;0.5024089331;0.5129562103;0.5228344876;0.533534703;0.5462038888;0.5604267305;0.5725309448;0.5825399414;0.5933902825;0.60572004;0.6208132236;0.6357598877;0.6471384375;0.6565189548;0.6650144674;0.6764585843;0.688473714;0.7009057167;0.7127206503;0.7223163747;0.7309630838;0.7397939032;0.751543739;0.7653598803;0.7791136072;0.7895549803;0.7978630925;0.8052200741;0.8125588639;0.8212071113;0.8317661593;0.843195618;0.8515469598;0.8606753344;0.8662945571;0.8712362469;0.8758733751;0.8834956388;0.8940467835;0.9055033544;0.9184418225;0.9269943889;0.9335038181;0.9390080294;0.9393642612;0.9434293798;0.9469191008;0.9500890733;0.9577203179;0.9644588923;0.9720790512;0.9764103807;0.9801371874;0.9839499631;0.9867535093;0.9919074935;1
Arri_650Plus_TU_L1_Flood;0.0709192358;0.0780011542;0.0838268275;0.0904401642;0.0978312313;0.1038824584;0.1105203444;0.117754164;0.1248613821;0.1327810298;0.1409765932;0.1496186975;0.1588363439;0.1671464506;0.175832878;0.1850992276;0.1939894234;0.2032968126;0.2131465498;0.2237247224;0.2331370825;0.2432010315;0.2542785682;0.2646114121;0.2748859286;0.2867974914;0.2979924761;0.3093697767;0.3195529647;0.331072474;0.3431973214;0.3542273202;0.3646865813;0.376864611;0.3898278167;0.4015911308;0.4124504414;0.4243576775;0.4370513508;0.4513418047;0.4640631171;0.476030028;0.4880722902;0.5012602469;0.5149219407;0.5266367587;0.5369916948;0.5465622405;0.5575347206;0.5713908865;0.584423412;0.5954682593;0.606238159;0.6171773733;0.6307393028;0.6453821455;0.6576896668;0.6683389517;0.6766358995;0.6866960335;0.6968170975;0.7082208201;0.720769518;0.7318943339;0.7410853258;0.7491866414;0.7591625495;0.7709535145;0.7839377753;0.7949955919;0.8046440567;0.8130366929;0.8198911863;0.8268070804;0.835366696;0.8451332182;0.8532573005;0.8636196782;0.8708370915;0.8764507633;0.8802630265;0.8861248906;0.8946595418;0.9041559755;0.9167776105;0.9261045051;0.9344079806;0.941274279;0.9423009827;0.9450876305;0.9467080756;0.9479125565;0.9538826893;0.9605063968;0.9688127212;0.9747311526;0.9810755784;0.9859659902;0.9887573094;0.9932690683;1
Arri_650Plus_TU_L2_Spot;0.0703748707;0.0772490608;0.0822957214;0.0882127866;0.0957963985;0.1021428783;0.1080047715;0.1141684019;0.1214126809;0.1300237476;0.1384609437;0.1459547675;0.1538504992;0.1626621958;0.1728283505;0.1822027482;0.189916872;0.1976659963;0.2068885607;0.2187883328;0.2302677869;0.240198871;0.2483411778;0.2567126052;0.2674389781;0.2809526727;0.2936596139;0.3059000829;0.3147620996;0.3223605547;0.3319088166;0.3442246404;0.3584124133;0.3732600519;0.3856003077;0.3945126405;0.4024199421;0.4118967375;0.4236249755;0.439183328;0.455950324;0.472148459;0.4850815618;0.4948075339;0.5022534351;0.5088242211;0.5186921604;0.5317015467;0.5468822222;0.5635751373;0.578352849;0.5904501179;0.6001563403;0.6067695342;0.613679717;0.622415103;0.6333244952;0.6490413025;0.6648575842;0.6815336875;0.6951308975;0.7058881115;0.7141568509;0.7202305036;0.7243338689;0.7279247354;0.7347784773;0.745966777;0.7611295304;0.7766200518;0.7925236528;0.8079664016;0.8199992547;0.8286494418;0.8347172968;0.8375013496;0.8348916749;0.8360573647;0.8371733167;0.8417985862;0.84996016;0.862369925;0.8776545289;0.8928098123;0.9095353771;0.921835031;0.9318786228;0.9404007037;0.9410916045;0.9409862386;0.9361971456;0.9286087144;0.9235082081;0.9214437767;0.9259588882;0.9330881141;0.9454395063;0.9599214689;0.9724859922;0.9862913247;1
Arri_650Plus_TU_L2_Flood;0.0748918629;0.0827435253;0.088379091;0.0941350792;0.1018401903;0.1088519627;0.1157129505;0.1220543146;0.1287878298;0.1376223232;0.1471787969;0.1556335309;0.1633174002;0.171340317;0.1816357849;0.192181883;0.2010292799;0.2088419391;0.2170423328;0.2282683327;0.2402919894;0.2516450482;0.2610049941;0.2689984594;0.2782134044;0.2908451121;0.303998674;0.3175467462;0.3278599948;0.3359702378;0.3441924801;0.354635445;0.3680957438;0.3833920268;0.3970895352;0.4076867315;0.4165702579;0.4253448458;0.434863311;0.4486204513;0.4649510531;0.4820688746;0.4963805587;0.5077244805;0.5167104257;0.5232170563;0.5306466349;0.5409697143;0.5548585474;0.5715871149;0.5870469434;0.6005594924;0.6121972442;0.6207156119;0.6281949133;0.6354022681;0.64337715;0.6561790847;0.6710964529;0.6877550869;0.7017979987;0.7135975674;0.7237851988;0.7320045173;0.7377853697;0.7419342319;0.7467129367;0.7543477159;0.7668209629;0.7806719159;0.7959697793;0.8117495054;0.8242823156;0.8339266152;0.841670204;0.8467285409;0.8466601832;0.8487887669;0.8493197503;0.8514038088;0.8554578594;0.8654597016;0.8788954304;0.8931539767;0.909913852;0.922612058;0.9332210358;0.9428380427;0.9454668351;0.9478930146;0.9454491894;0.9396165869;0.9355576156;0.9323700733;0.9341472136;0.937918952;0.9480471236;0.9603773289;0.9727325525;0.9867021182;1
Arri_650Plus_TU_L3_Spot;0.0618733977;0.0676603902;0.0731842722;0.0791780487;0.085032737;0.0906196384;0.0974306984;0.1034144335;0.1088490926;0.1167053001;0.1251706938;0.1319945557;0.1390186165;0.148015902;0.1578329502;0.1650255148;0.1716748224;0.1814098328;0.1927763907;0.2030027796;0.2098682762;0.21789867;0.2303785652;0.243154842;0.2528824105;0.2606289362;0.2690370484;0.2827437866;0.2974806463;0.3085094052;0.3158482449;0.3231239389;0.3347921256;0.3510714598;0.3665108969;0.3767849104;0.3836415177;0.3915547722;0.4037429581;0.42143327;0.4393381221;0.4535186974;0.462006382;0.4677497788;0.4760367858;0.4894177229;0.5085120329;0.5250102547;0.5369630044;0.5452098912;0.5494923476;0.5546163411;0.5674890293;0.5858806401;0.6064034887;0.6244746643;0.6361721101;0.6422368946;0.6441018633;0.6480249691;0.6563331971;0.6715333102;0.6920708952;0.7106273273;0.7254352779;0.7349028228;0.7396943235;0.7400649558;0.7423801132;0.7488859058;0.7627429315;0.7830268691;0.8021249786;0.817411478;0.8292332365;0.8356594358;0.8338883181;0.8325055153;0.8295889675;0.8311029285;0.8394494302;0.8552089368;0.8741404914;0.8923414673;0.9102820313;0.9213583551;0.9280303563;0.9310447551;0.9242712711;0.9173364667;0.9098310833;0.9057701152;0.9123829983;0.9259099773;0.9431240259;0.9590545489;0.9761861868;0.9883615614;0.9951607263;0.9994499727;1
Arri_650Plus_TU_L3_Flood;0.0714854097;0.0777523225;0.0833954177;0.0904600296;0.0973492772;0.1026681985;0.1098832447;0.1176664777;0.1234786669;0.1307335696;0.1402395154;0.1489848547;0.1561974735;0.1640111677;0.1746648012;0.1842182042;0.1912457162;0.1993578543;0.2104970316;0.2229061786;0.2318035903;0.2386375325;0.2486922915;0.2621891457;0.2745678691;0.2842963896;0.2912387868;0.3020111649;0.3166504901;0.3303259761;0.3403548159;0.3474832225;0.3558107042;0.3696246536;0.3860234474;0.3995000868;0.4091159515;0.4166846915;0.4250204808;0.439420538;0.4574142129;0.4745795509;0.4865878552;0.4942588007;0.5003283396;0.508674034;0.5241321289;0.5413485203;0.556017238;0.5679254247;0.5752564906;0.5795690239;0.5866245552;0.6001152318;0.6191003189;0.6382347345;0.6529403121;0.6631638483;0.6680636498;0.6716942209;0.6750690953;0.6834815591;0.7004970105;0.7192609531;0.7357458752;0.7483015252;0.7572050781;0.7613939369;0.7636442546;0.7653571087;0.7723309465;0.7868354241;0.8042740302;0.8203959411;0.8338473902;0.8433902964;0.84612672;0.848968406;0.8476680613;0.8462435454;0.8466918571;0.8556390784;0.8708202118;0.8878421628;0.9062886027;0.9189038442;0.9282102215;0.9349769413;0.933405243;0.9303961134;0.9233441922;0.9157002643;0.9146478333;0.9208949876;0.9344269804;0.9488866719;0.9657966901;0.979676154;0.9881745701;0.995800344;1
Arri_650Plus_TU_L4_Spot;0.066328486;0.0728696763;0.0783385052;0.0844958501;0.0914624486;0.097241963;0.1035049709;0.1105176182;0.117386944;0.1249070605;0.1330220049;0.1413874719;0.1501592229;0.1584626763;0.1670392361;0.1758246628;0.1847020837;0.1939977324;0.2034931763;0.2137960153;0.2234397177;0.2334413761;0.2439419777;0.2542239649;0.2649125466;0.2765500499;0.2869705389;0.2984054686;0.3094295019;0.3206040724;0.3316302827;0.3426546669;0.3542953729;0.3666125475;0.3780827454;0.3890463025;0.4012738245;0.4143285678;0.4261338259;0.4383366222;0.4511788261;0.4654022289;0.4782084053;0.4896931287;0.5014418552;0.5133119504;0.5262304755;0.5370207578;0.5467479679;0.5582942786;0.5710920751;0.5843865744;0.597343032;0.6077770728;0.619324122;0.6328600262;0.6458157996;0.6582882637;0.667484913;0.677081142;0.6864457375;0.6976186706;0.7107275053;0.7221880827;0.7314826938;0.7394488356;0.7497138946;0.7625332297;0.7762174348;0.7870092528;0.7954331304;0.8024069741;0.8093737267;0.8182693974;0.829141499;0.8402120549;0.8470688546;0.854288074;0.8584631973;0.8635310401;0.8698837598;0.8798864633;0.891717107;0.9020368279;0.9112539726;0.9157994725;0.9198755001;0.9258007744;0.930245102;0.9388154236;0.9452375072;0.9484301871;0.9516597959;0.9525771902;0.9553770314;0.9577398052;0.9635993174;0.9727579752;0.9813576292;0.9915127962;1
Arri_650Plus_TU_L4_Flood;0.0745707824;0.0817642667;0.0878226703;0.0946547006;0.1023386654;0.1086113348;0.1154143995;0.1230703367;0.130298353;0.1384380407;0.1470129989;0.1559244441;0.1652778763;0.1739633772;0.1829936914;0.1921328051;0.2012842737;0.2110389317;0.2207816548;0.2313695871;0.2413701585;0.2516039836;0.2621242572;0.2727715671;0.2836606034;0.2954251833;0.3059758705;0.3175784208;0.3287023611;0.3398573324;0.3508645841;0.3619949279;0.3736290983;0.3859872838;0.3972478376;0.4082515089;0.4206233568;0.4336305051;0.4450532918;0.4572904011;0.4702687873;0.484544056;0.4969470771;0.5081496064;0.5197844047;0.5320472436;0.5446661522;0.5545355016;0.5637511505;0.5757218928;0.5887245804;0.601732135;0.6138475093;0.6234964919;0.6351229156;0.6490740326;0.6621226436;0.6735525466;0.6815617926;0.6906155811;0.7004951678;0.712199559;0.7246978207;0.7350345816;0.7429746804;0.7503643292;0.7611189836;0.7747821434;0.7881492274;0.7974202025;0.8044841653;0.8100988315;0.8173673325;0.827069166;0.8383109877;0.8488342324;0.8539095724;0.8594443563;0.8627125603;0.8675435447;0.875088618;0.8858484789;0.8972077849;0.906067646;0.9136986849;0.917006084;0.920556399;0.9269859961;0.9322921544;0.9419085189;0.9478905422;0.9497564044;0.9520700773;0.951225567;0.9534198954;0.9556880434;0.9626981;0.9733706602;0.9833101577;0.9925356497;1
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
Arri_1000Plus_TU_L1_Spot;0.0678901459;0.0744363574;0.0803133472;0.0863179999;0.0932729881;0.0997262036;0.105759511;0.1124609453;0.120149975;0.1277963146;0.1352105884;0.1441191177;0.1537527186;0.16127252;0.169289091;0.1794796986;0.1887761839;0.1966821995;0.2057022483;0.2176029095;0.2279501263;0.2358679152;0.2452664945;0.2576474259;0.2696144301;0.2794115939;0.287668489;0.300059491;0.3139734644;0.3250689786;0.3332905561;0.3423684729;0.3554347395;0.3708846653;0.383075522;0.3908709152;0.3995694835;0.4132188808;0.4291979286;0.4446334397;0.4553247091;0.4637665178;0.473964635;0.4889391073;0.5060692753;0.5201126095;0.5298126922;0.5353124568;0.542194233;0.5557188599;0.5729214593;0.5896170283;0.6026162025;0.6105530552;0.617309786;0.6263014516;0.6399288181;0.6589243915;0.6746587978;0.6874423931;0.6940844739;0.697003196;0.7008134644;0.7112496;0.7275372605;0.7453747305;0.7621397383;0.7740956507;0.781505134;0.7838034027;0.7855947868;0.7917301178;0.8045219593;0.8207873152;0.8371054161;0.8499985201;0.8558648485;0.8608646097;0.8611559633;0.8602555909;0.8608347428;0.8696558243;0.8849562402;0.902193588;0.9201122923;0.931429103;0.9389649017;0.9434995268;0.9388259125;0.9349244958;0.9303647469;0.9291109626;0.9382539988;0.9512235838;0.9669248194;0.9791218778;0.990756558;0.9976485326;0.9996141318;1;0.9969475438
Arri_1000Plus_TU_L1_Flood;0.0751769682;0.0823713051;0.0887264593;0.0954088446;0.103016726;0.1097779911;0.116470737;0.1235338704;0.1317121566;0.1399141548;0.1477669523;0.1571503224;0.1672183222;0.1751722322;0.183548667;0.1940485188;0.2037420036;0.212106824;0.221343683;0.2335218828;0.2441759702;0.252347128;0.2618572996;0.2743501407;0.2865008972;0.2966577933;0.305020213;0.317393435;0.3312649349;0.3424498468;0.3509058031;0.3599977698;0.3729210693;0.3882714275;0.400668751;0.4086133459;0.4172034444;0.4308365525;0.4465231741;0.4618980628;0.4728413535;0.4814261801;0.4916210975;0.5062380142;0.5230000868;0.5369051989;0.5466840018;0.552178819;0.5589998738;0.5720628075;0.5886935295;0.6050075434;0.6179798039;0.6260849826;0.6329696174;0.6416688555;0.6546838775;0.6727480872;0.6878482209;0.7004611513;0.7073319866;0.7103810261;0.7140539131;0.7239420748;0.7392309404;0.7562764542;0.7724537858;0.7842002364;0.7916270514;0.7942489738;0.7960865364;0.8016829643;0.8134515642;0.828750432;0.8439976206;0.8562450083;0.8618662335;0.8671237915;0.8676494554;0.8668229533;0.8671435329;0.875298538;0.8894117358;0.9056560621;0.922596624;0.9336022434;0.9408401323;0.9455978067;0.9414307429;0.9381607249;0.9335976976;0.9318119067;0.9402318275;0.9518190937;0.9666738459;0.9783290772;0.9898730992;0.996996028;0.9992308908;1;0.997277527
Arri_1000Plus_TU_L2_Spot;0.0732902496;0.0803554314;0.0860344983;0.0925924901;0.0999929716;0.1059124125;0.1126635758;0.1198777246;0.1269672558;0.1350905255;0.143387638;0.1519586069;0.1614684597;0.1699107365;0.1783137267;0.187761666;0.1969717591;0.205872206;0.2156476981;0.2266965772;0.2359841599;0.2453246064;0.2566366545;0.2675388541;0.2772067456;0.2883364091;0.3000251308;0.312152012;0.3213918272;0.3316403967;0.3442025313;0.3562999725;0.3662548123;0.3766399399;0.3891124798;0.4023501131;0.4138367394;0.4241696091;0.4347928416;0.4492363438;0.4641294748;0.4767614292;0.4865681073;0.4970702375;0.5107435329;0.5249678448;0.5372972239;0.5452548053;0.5527513827;0.5645240464;0.5792201474;0.594019872;0.6062296714;0.6144325302;0.6234918138;0.6358343018;0.6505932246;0.666208583;0.6766911112;0.6847415261;0.6898138767;0.6966097203;0.7095318464;0.7254012621;0.7395873963;0.7494789825;0.7563084375;0.7611896479;0.7690869176;0.779947982;0.7944383723;0.8095057235;0.8196649008;0.8250604599;0.827928107;0.8299548637;0.833105523;0.8448775769;0.8581961514;0.8712549276;0.8801335981;0.885233753;0.8882718854;0.8897600291;0.8938441234;0.9009986478;0.913064811;0.9278845475;0.9378971086;0.9454448749;0.9462904181;0.9425536989;0.9393477169;0.9369460363;0.9399103618;0.9467371598;0.9605371515;0.9751159654;0.9852199787;0.9947571745;1
Arri_1000Plus_TU_L2_Flood;0.0759960961;0.0833769533;0.0893692927;0.0961601904;0.1038266054;0.1099999719;0.1170607086;0.1245159174;0.13172766;0.1401173151;0.1485641803;0.1573871569;0.1670882152;0.1756610632;0.1843023578;0.1939358142;0.2032611847;0.2122003965;0.2220686785;0.2332500944;0.2424634848;0.2519396836;0.2633645291;0.2742188776;0.2837472699;0.295020191;0.3067606067;0.3187804844;0.3279312543;0.3382083808;0.3507847383;0.3627001914;0.372493524;0.3827952145;0.3954993305;0.4086576597;0.4199318746;0.4300295266;0.4408093548;0.4554071791;0.4701329734;0.4825794203;0.4922157365;0.5027320632;0.5165729089;0.530785243;0.5427882825;0.5503765057;0.55775403;0.5696431843;0.5844933741;0.5991447362;0.6108211979;0.6186985543;0.6277177674;0.6401795122;0.6551214441;0.6705978906;0.6805815913;0.6882989783;0.6930207311;0.6997723309;0.7131060316;0.7290614319;0.7430204448;0.7522924315;0.7586021147;0.7631520277;0.7710587417;0.7824027769;0.7971189827;0.812214425;0.8217666929;0.8264059977;0.8289446385;0.8307470257;0.8341613498;0.8463142409;0.859872927;0.8729223026;0.8811860721;0.8858137303;0.8882560781;0.8893103692;0.8934944617;0.9010092062;0.9135987281;0.9288036219;0.9385146698;0.9455670145;0.9463685546;0.9420363204;0.9382171556;0.9357406492;0.9392052546;0.9464838671;0.9616492787;0.9760571631;0.9863566853;0.994961829;1
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
Arri_2000Plus_TU_L1_Spot;0.0670002453;0.0731756421;0.0784066233;0.0868944314;0.0927751731;0.0994792471;0.1083878838;0.1142257206;0.1216522424;0.1298092343;0.1382588395;0.1465612517;0.1553635357;0.1630639256;0.1719442137;0.1803792918;0.1896054029;0.1992610923;0.2093012844;0.219470146;0.2297781508;0.2409639424;0.2507585352;0.262132303;0.2728559432;0.2841744851;0.2963244921;0.3060941551;0.3189620056;0.3304248364;0.3406569277;0.3533068472;0.364412878;0.3770472055;0.3893461029;0.3995546182;0.4121697352;0.4251492251;0.4369359883;0.450371852;0.4637698731;0.4768423175;0.4899643711;0.5023786729;0.514081876;0.5252420669;0.5373627929;0.5484850738;0.5602680961;0.5723280234;0.5845150182;0.5966899772;0.6079122735;0.6201100922;0.6330491021;0.645834341;0.6578774325;0.6692056064;0.6802207168;0.6931209457;0.7031055791;0.7129546509;0.7234280981;0.7341878726;0.7449998251;0.7551755129;0.765920908;0.7763890914;0.7884078973;0.7979620793;0.8080074584;0.8192503919;0.8277067362;0.8384803533;0.8452482399;0.8548921533;0.8580894934;0.8660822173;0.8743487965;0.8788065188;0.8889655108;0.8943845983;0.9021442624;0.913463443;0.9220478481;0.9326734939;0.9421090621;0.9524752333;0.9571585258;0.9607451761;0.9656125449;0.9673562649;0.9621900136;0.9686764189;0.973996622;0.9762697741;0.9805849303;0.989448926;0.9966148924;0.9959348104;1
Arri_2000Plus_TU_L1_Flood;0.0776753967;0.0861312478;0.0931573638;0.1005991538;0.1081903055;0.1162842242;0.1256211932;0.1320788514;0.1390118738;0.1480565117;0.1589335531;0.1689782245;0.1780066142;0.1853905564;0.194099528;0.2055791016;0.2170680816;0.2275283002;0.2365527459;0.2456720187;0.2559785416;0.2682191326;0.2822402944;0.2948917567;0.3056642922;0.3155049164;0.3243816165;0.3365224112;0.3505432279;0.3648953155;0.378071928;0.3894490138;0.3989802635;0.4083156827;0.4182635973;0.4308079367;0.4461235116;0.4620761888;0.4761476195;0.4893739687;0.5000786651;0.5091645465;0.5186054025;0.5296039394;0.5431915291;0.5576739137;0.5732961277;0.5863649019;0.5974392604;0.6080377815;0.6161403827;0.622703827;0.6302727693;0.6405292415;0.6556687588;0.6720919011;0.6879322096;0.7024365108;0.7140660185;0.7250838187;0.7329994067;0.7381840487;0.7421016004;0.7472481605;0.7546440482;0.7641902139;0.7765667931;0.7925701417;0.8077061863;0.8199568611;0.8313122064;0.8413943813;0.848672602;0.8539394182;0.8567558136;0.8589568856;0.8564333952;0.8598776481;0.8650206566;0.871636327;0.8819337321;0.8939136467;0.9068066953;0.9197841365;0.9318422364;0.941160685;0.9489192215;0.9569910547;0.9556703444;0.9577659993;0.9537818591;0.9484946217;0.9437122446;0.9407839611;0.9444404877;0.9488014263;0.9581698645;0.9689447803;0.9797587685;0.9896491323;1
Arri_2000Plus_TU_L2_Spot;0.0849908618;0.0934242262;0.1014674929;0.1099822002;0.1177662719;0.1253492666;0.134697463;0.142342758;0.1505975773;0.159637769;0.169856497;0.1800171878;0.1904933755;0.1997750064;0.209304278;0.2195967056;0.2303571462;0.2413313828;0.2523891267;0.2638755688;0.2752065316;0.2859739007;0.2984220703;0.3110820525;0.3235548365;0.336376107;0.3476023506;0.3600019504;0.3717992313;0.3840503067;0.3968868437;0.4097881732;0.422592282;0.4354193939;0.4470743949;0.4580069137;0.4703430633;0.4838711635;0.4971171506;0.5115211342;0.5255459198;0.5389505797;0.5514061384;0.5627478222;0.5737986877;0.584514454;0.5964193307;0.6080717277;0.6198356646;0.6329621056;0.6450896267;0.6561521142;0.6666775366;0.6766317536;0.6884804157;0.7002021719;0.7113928896;0.7227900287;0.733928084;0.7462221455;0.7568915945;0.7662926634;0.7754789701;0.7842146418;0.7927361138;0.8000912238;0.8075426722;0.8173322426;0.8278594127;0.8369424124;0.8467434591;0.8578046216;0.8671546628;0.8757761628;0.8832203537;0.8903071011;0.8922707743;0.897679717;0.9016222612;0.9043663377;0.9078215986;0.9133081896;0.9208924339;0.9296837496;0.9388365308;0.947253347;0.9554784366;0.9657392633;0.9675925388;0.9739426527;0.9740991903;0.9725736343;0.9725494563;0.9706558513;0.9727960135;0.9746475875;0.9764587147;0.9828435141;0.9875704484;0.993260968;1
Arri_2000Plus_TU_L2_Flood;0.0807115752;0.0894703458;0.0978699426;0.1061627135;0.1134289945;0.1210199135;0.1311422825;0.1390971162;0.1467088486;0.1549488498;0.1652000931;0.1761501872;0.1869601387;0.1956857765;0.2037413337;0.2134131758;0.2247962367;0.236737734;0.2478482829;0.2581276279;0.2673735789;0.2771588324;0.2903191557;0.304245815;0.3171887589;0.3292253492;0.3383798574;0.3483892624;0.3594848121;0.3729902733;0.3875002874;0.4012449413;0.4133500703;0.4242526571;0.4329438741;0.442105278;0.4544771669;0.4699193695;0.4851810575;0.5010043704;0.5146981453;0.5261441518;0.5357855811;0.544368079;0.5538931202;0.5652343121;0.5800344998;0.594509798;0.6080847351;0.6214793674;0.6325468091;0.6411901792;0.6484444601;0.6555268879;0.6658492166;0.6782501983;0.692181294;0.7073740594;0.7211370925;0.7347050672;0.7453691595;0.7534360406;0.7599210124;0.7654489258;0.770723779;0.7757173269;0.7824415027;0.7943932032;0.8082257223;0.8210994055;0.8341906965;0.8468468523;0.8570115297;0.8650534328;0.871078191;0.875781009;0.8748740527;0.8774277218;0.8791833208;0.8806829211;0.8852428769;0.8937849453;0.905407754;0.9185606003;0.9326556734;0.9441852595;0.9544661939;0.9655744567;0.9672118435;0.9728555981;0.9714115779;0.9684882194;0.9654110348;0.9608515281;0.9615879829;0.9613272893;0.9640521407;0.9712406305;0.9792932863;0.9894410019;1
I have tried to use various combinations of dtype and unpack=True without too much success so far.
I think that a better choice to read your data is not Numpy but Pandas.
The basic reasons are that:
Numpy arrays should have the same type for each element,
but your first column is rather a "sample name" (string),
and in the resulting (transposed) array it should be the index column
(which is supported only in Pandas),
leaving all other columns of float type.
So to read your data (so far without any transposition) use:
df = pd.read_csv('Input.csv', sep=';', index_col=0).dropna()
Note the final dropna() to drop rows with NaN, resulting from
rows containing only semi-colons.
Then, to get your transposed array, run:
result = df.T
The result (limited to initial columns and rows) is:
Wavelength(nm) Arri_650Plus_TU_L1_Spot Arri_650Plus_TU_L1_Flood Arri_650Plus_TU_L2_Spot Arri_650Plus_TU_L2_Flood
380 0.064942 0.070919 0.070375 0.074892
384 0.071336 0.078001 0.077249 0.082744
388 0.076722 0.083827 0.082296 0.088379
392 0.082816 0.090440 0.088213 0.094135
396 0.089634 0.097831 0.095796 0.101840
Note that Wavelength(nm) is the name of the index column and each other
(regular) column has the sample name as its name.
Or you can do the whole job in one go, running:
result = pd.read_csv(io.StringIO(txt), sep=';', index_col=0).dropna().T
If you wish, you can take the underlying Numpy array for further processing,
running:
result2 = result.values
But in this case you lose column names and row indices.
Such an array has:
row indices as consecutive integers (starting from 0), instead of
wavelengths,
column indices also as consecutive integers, instead of sample names.
Which representation to use is your choice.
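Since the question mentions converting the result to a dict, here is a minimal sketch of how the transposed frame could be turned into one while keeping the sample names. The three-wavelength excerpt and the names Light_A, Light_B, spectra and wavelengths are made up for illustration; only the file layout follows the question.

```python
import io

import pandas as pd

# Hypothetical excerpt in the same layout as the question's file:
# a header row of wavelengths, an all-semicolon separator row, then samples.
txt = """Wavelength(nm);380;384;388
;;;
Light_A;0.06;0.07;0.08
Light_B;0.07;0.08;0.09
"""

df = pd.read_csv(io.StringIO(txt), sep=';', index_col=0).dropna()
result = df.T

# One float array per sample, keyed by the sample name.
spectra = {name: result[name].to_numpy() for name in result.columns}

# The wavelengths were read as column labels (strings); convert them to floats.
wavelengths = result.index.astype(float).to_numpy()

print(spectra['Light_A'], wavelengths)
```

This keeps the names in the dict keys, so nothing is lost when the values themselves are plain Numpy arrays.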
The straightforward use of genfromtxt:
data = np.genfromtxt(txt.splitlines(), delimiter=';', dtype=None, encoding=None)
In [113]: data.shape
Out[113]: (20,)
In [115]: len(data.dtype.fields)
Out[115]: 102
Since just the first field is string, and the rest float, it might be better to load them separately:
In [116]: labels = np.genfromtxt(txt.splitlines(), delimiter=';', dtype=None, encoding=None, usecols=(0,))
In [117]: labels
Out[117]:
array(['Wavelength(nm)', '', 'Arri_650Plus_TU_L1_Spot',
'Arri_650Plus_TU_L1_Flood', 'Arri_650Plus_TU_L2_Spot',
'Arri_650Plus_TU_L2_Flood', 'Arri_650Plus_TU_L3_Spot',
'Arri_650Plus_TU_L3_Flood', 'Arri_650Plus_TU_L4_Spot',
'Arri_650Plus_TU_L4_Flood', '', 'Arri_1000Plus_TU_L1_Spot',
'Arri_1000Plus_TU_L1_Flood', 'Arri_1000Plus_TU_L2_Spot',
'Arri_1000Plus_TU_L2_Flood', '', 'Arri_2000Plus_TU_L1_Spot',
'Arri_2000Plus_TU_L1_Flood', 'Arri_2000Plus_TU_L2_Spot',
'Arri_2000Plus_TU_L2_Flood'], dtype='<U25')
In [118]: data = np.genfromtxt(txt.splitlines(), delimiter=';', dtype=None, encoding=None, usecols=range(1,102))
In [119]: data.dtype
Out[119]: dtype('float64')
In [120]: data.shape
Out[120]: (20, 101)
(The rows that contain only ; are filled with nan.)
One could make a structured array using the labels as field names, but..., I wonder if that is the most useful array. The homogeneous data array may be better for some tasks, for example calculations across the 20 "fields".
unpack returns a list, one array per column of the regular data. With version 1.20 I get one array of the labels, and 101 arrays of the numeric values, all with (20,) shape.
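As a rough sketch of the "load labels and numbers separately" idea, the two arrays can be combined into a dict keyed by light name once the all-nan separator rows are dropped. The short txt sample and the names spectra, wavelengths and keep are hypothetical; the genfromtxt arguments are the ones used above.

```python
import numpy as np

# Hypothetical excerpt in the same layout as the question's file.
txt = """Wavelength(nm);380;384;388
;;;
Light_A;0.06;0.07;0.08
Light_B;0.07;0.08;0.09
"""
lines = txt.splitlines()

# First column as strings, remaining columns as a homogeneous float array.
labels = np.genfromtxt(lines, delimiter=';', dtype=None, encoding=None, usecols=(0,))
values = np.genfromtxt(lines, delimiter=';', usecols=range(1, 4))

wavelengths = values[0]                    # header row parsed as floats
keep = ~np.isnan(values).all(axis=1)       # drop the all-semicolon rows
keep[0] = False                            # drop the wavelength row itself

# One float array per light, keyed by its label.
spectra = dict(zip(labels[keep], values[keep]))
print(spectra['Light_A'])
```

The homogeneous values array stays available for calculations across lights, and the dict gives access by name where that is more convenient.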
Related
I have a pandas dataframe that I want to split into several smaller pieces of 100k rows each, then save onto the disk so that I can read in the data and process it one by one. I have tried using dill and hdf storage, as csv and raw text appear to take a lot of time.
I am trying this out on a subset of data with ~500k rows and five columns of mixed data. Two contain strings, one integers, one floats, and the final one contains bigram counts from sklearn.feature_extraction.text.CountVectorizer, stored as a scipy.sparse.csr.csr_matrix sparse matrix.
It is the last column that I am having problems with. Dumping and loading the data goes without issue, but when I try to actually access the data it is instead a pandas.Series object. Secondly, each row in that Series is a tuple which contains the whole dataset instead.
# Before dumping, the original df has 100k rows.
# Each column has one value except for 'counts' which has 1400.
# Meaning that df['counts'] give me a sparse matrix that is 100k x 1400.
vectorizer = sklearn.feature_extraction.text.CountVectorizer(analyzer='char', ngram_range=(2,2))
counts = vectorizer.fit_transform(df['string_data'])
df['counts'] = counts
df_split = pandas.DataFrame(np.column_stack([df['string1'][0:100000],
df['string2'][0:100000],
df['float'][0:100000],
df['integer'][0:100000],
df['counts'][0:100000]]),
columns=['string1','string2','float','integer','counts'])
dill.dump(df, open(file[i], 'w'))
df = dill.load(file[i])
print(type(df['counts']))
> <class 'pandas.core.series.Series'>
print(np.shape(df['counts']))
> (100000,)
print(np.shape(df['counts'][0]))
> (496718, 1400) # 496718 is the number of rows in my complete data set.
print(type(df['counts']))
> <type 'tuple'>
Am I making any obvious mistake, or is there a better way to store this data in this format, one which isn't very time consuming? It has to be scalable to my full data containing 100 million rows.
df['counts'] = counts
This produces a Pandas Series (column) with len(df) elements, where each element is the whole sparse matrix returned by vectorizer.fit_transform(df['string_data']).
You can try the following instead:
df = df.join(pd.DataFrame(counts.A, columns=vectorizer.get_feature_names(), index=df.index))
NOTE: be aware that this explodes your sparse matrix into a dense (not sparse) DataFrame, so it will use much more memory and you can end up with a MemoryError.
CONCLUSION:
That's why I'd recommend storing your original DF and the sparse count matrix separately.
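One possible way to store the two pieces separately, sketched under assumptions: the frame chunk is pickled with DataFrame.to_pickle and the matching rows of the sparse matrix are written with scipy.sparse.save_npz. The tiny df and counts, the chunk size of 2, and the file names frame_N.pkl / counts_N.npz are all hypothetical stand-ins for the question's data.

```python
import os
import tempfile

import numpy as np
import pandas as pd
from scipy import sparse

# Hypothetical stand-ins for the question's DataFrame and bigram-count matrix.
df = pd.DataFrame({'string1': ['ab', 'cd', 'ef'], 'integer': [1, 2, 3]})
counts = sparse.csr_matrix(np.array([[1, 0, 2], [0, 0, 3], [4, 0, 0]]))

chunk_size = 2                  # would be 100000 in the question
out_dir = tempfile.mkdtemp()    # hypothetical output location

# Write each chunk of rows as a pair of files: one pickle, one sparse .npz.
for n, start in enumerate(range(0, len(df), chunk_size)):
    stop = start + chunk_size
    df.iloc[start:stop].to_pickle(os.path.join(out_dir, f'frame_{n}.pkl'))
    sparse.save_npz(os.path.join(out_dir, f'counts_{n}.npz'), counts[start:stop])

# Later: load one chunk at a time; the matrix stays sparse.
frame0 = pd.read_pickle(os.path.join(out_dir, 'frame_0.pkl'))
counts0 = sparse.load_npz(os.path.join(out_dir, 'counts_0.npz'))
print(frame0.shape, counts0.shape)
```

Row i of counts0 corresponds to row i of frame0, so the pairing is kept by position rather than by putting the matrix inside the DataFrame.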
I can't post the data being imported, because it's too much. But, it has both number and string fields and is 5543 rows and 137 columns. I import data with this code (ndnames and ndtypes holds the column names and column datatypes):
npArray2 = np.genfromtxt(fileName,
delimiter="|",
skip_header=1,
dtype=(ndtypes),
names=ndnames,
usecols=np.arange(0,137)
)
This works and the resulting variable type is "void7520" with size (5543,). But this is really a 1D array of 5543 rows, where each element holds a sub-array that has 137 elements. I want to convert this into a normal numpy array of 5543 rows and 137 columns. How can this be done?
I have tried the following (using Pandas):
pdArray = pd.read_csv(fileName,
sep=ndelimiter,
index_col=False,
skiprows=1,
names=ndnames
)
npArray = pd.DataFrame.as_matrix(pdArray)
But, the resulting npArray is type Object with size (5543,137) which, at first, looks promising. But, because it's type Object, there are other functions that can't be performed on it. Can this Object array be converted into a normal numpy array?
Edit:
ndtypes look like...
[int,int,...,int,'|U50',int,...,int,'|U50',int,...,int]
That is, 135 number fields with two string-type fields in the middle somewhere.
npArray2 is a 1d structured array, 5543 elements and 137 fields.
What does npArray2.dtype look like, or equivalently what is ndtypes, because the dtype is built from the types and names that you provided. "void7520" is a way of identifying a record of this array, but tells us little except the size (in bytes?).
If all fields of the dtype are numeric, or better yet if they are all the same numeric dtype (int, float), then it is fairly easy to convert it to a 2d array with 137 columns (2nd dim). astype and view can be used.
(edit - it has both number and string fields - you can't convert it to a 2d array of numbers; it could be an array of strings, but you can't do numeric math on strings.)
But if the dtypes are mixed then you can't convert it. All elements of the 2d array have to be the same dtype. You have to use the structured array approach if you want mixed types. (Well, there is dtype=object, but let's not go there.)
Actually pandas is going the object route. Evidently it thinks the only way to make an array from this data is to let each element be its own type. And the math of object arrays is severely limited. They are, in effect, a glorified, or debased, list.
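For the mixed case, one option (a sketch, not the only approach) is to pull out just the numeric fields and convert those to an ordinary 2d array with numpy.lib.recfunctions.structured_to_unstructured, leaving the string fields in the structured array. The three-field dtype below is a hypothetical miniature of the 137-field one in the question.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical miniature of the question's dtype: ints with a string field in between.
dt = np.dtype([('a', int), ('name', 'U50'), ('b', int)])
arr = np.array([(1, 'x', 10), (2, 'y', 20)], dtype=dt)

# Keep only integer/unsigned/float fields, then flatten them into a 2d array.
numeric_names = [n for n in arr.dtype.names if arr.dtype[n].kind in 'iuf']
numeric_2d = rfn.structured_to_unstructured(arr[numeric_names])

print(numeric_2d.shape)   # (rows, number of numeric fields)
```

The string columns remain reachable by name, e.g. arr['name'], while numeric_2d supports the usual array math.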
I am quite new to numpy and python in general. I am getting a dimension mismatch error when I try to append values even though I have made sure that both arrays have the same dimension. Also another question I have is why does numpy create a single dimensional array when reading in data from a tab delimited text file.
import numpy as np
names = ["Angle", "RX_Power", "Frequency"]
data = np.array([0,0,0],float) #experimental
data = np.genfromtxt("rx_power_mode 0.txt", dtype=float, delimiter='\t', names = names, usecols=[0,1,2], skip_header=1)
freq_177 = np.zeros(shape=(data.shape))
print(freq_177.shape) #outputs(315,)
for i in range(len(data)):
    if data[i][2] == 177:
        #np.concatenate(freq_177,data[i]) has same issue
        np.append(freq_177,data[i],0)
The output I am getting is
all the input arrays must have same number of dimensions
Annotated code:
import numpy as np
names = ["Angle", "RX_Power", "Frequency"]
You don't need to 'initialize' an array - unless you are going to assign values to individual elements.
data = np.array([0,0,0],float) #experimental
This data assignment completely overwrites the previous one.
data = np.genfromtxt("rx_power_mode 0.txt", dtype=float, delimiter='\t', names = names, usecols=[0,1,2], skip_header=1)
Look at data at this point. What is data.shape? What is data.dtype? Print it, or at least some elements. With names I'm guessing that this is a 1d array, with a 3 field dtype. It's not a 2d array, though, with all floats, it could be transformed/viewed as such.
Why are you making a 1d array of zeros?
freq_177 = np.zeros(shape=(data.shape))
print(freq_177.shape) #outputs(315,)
With a structured array like data, the preferred way to index a given element is by field name and row number, e.g. data['Frequency'][i]. Play with that.
np.append is not the same as the list append. It returns a value; it does not change freq_177 in place. Same for concatenate. I recommend staying away from np.append. It's too easy to use it in the wrong way and place.
for i in range(len(data)):
    if data[i][2] == 177:
        #np.concatenate(freq_177,data[i]) has same issue
        np.append(freq_177,data[i],0)
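To see why the loop above leaves freq_177 unchanged, here is a minimal sketch. The names original and grown are made up for illustration; the only point is that np.append hands back a new array and leaves its input alone.

```python
import numpy as np

original = np.zeros(3)

# np.append allocates and returns a new array; `original` is not modified
grown = np.append(original, [1.0, 2.0])

print(original.shape)  # still the 3-element array
print(grown.shape)     # the new, longer array

# To "keep" the appended values you have to rebind a name to the result,
# which copies the whole array on every call (slow inside a loop).
```

This is also why collecting rows in a Python list and converting once at the end is usually the cheaper pattern.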
It looks like you want to collect in freq_177 all the terms of the data array for which the 'Frequency' field is 177.
I = data['Frequency'].astype(int)==177
freq_177 = data[I]
I have used astype(int) because the == test with floats is uncertain. It is best used with integers.
I is a boolean mask, true where the values match; data[I] then is the corresponding elements of data. The dtype will match that of data, that is, it will have 3 fields. You can't append or concatenate it to an array of float zeros (your original freq_177).
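A small self-contained sketch of the mask approach, assuming a made-up three-row, tab-delimited sample in place of "rx_power_mode 0.txt" (the numbers are invented). The field names are the ones from the names list in the question.

```python
import io
import numpy as np

# Hypothetical stand-in for the real file: one header line, then tab-separated floats.
text = "h1\th2\th3\n0.0\t-40.5\t177\n1.0\t-41.0\t180\n2.0\t-39.8\t177\n"
names = ["Angle", "RX_Power", "Frequency"]

data = np.genfromtxt(io.StringIO(text), dtype=float, delimiter="\t",
                     names=names, usecols=[0, 1, 2], skip_header=1)

print(data.shape)        # 1d: one record per row of the file
print(data.dtype.names)  # the three field names

# Boolean mask over the records, then select the matching records
mask = data["Frequency"].astype(int) == 177
freq_177 = data[mask]

print(freq_177["Angle"])  # the Angle field of the selected records
```

The selection keeps the structured dtype, so the individual fields of freq_177 are still addressed by name.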
If you must iterate and collect values, I suggest using list append, e.g.
alist = []
for row in data:
    if int(row['Frequency'])==177:
        alist.append(row)
freq_177 = np.array(alist)
I don't think np.append is discussed much except in its own doc page and text. It comes up periodically in SO questions.
http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.append.html
Returns: append : ndarray
A copy of arr with values appended to axis. Note that append does not occur in-place: a new array is allocated and filled.
See also help(np.append) in an interpreter shell.
For genfromtxt - it too has docs, and lots of SO discussion. But to understand what it returned in this case, you also need to read about structured arrays and compound dtypes.
Try loading the data with:
data = np.genfromtxt("rx_power_mode 0.txt", dtype=float, delimiter='\t', usecols=[0,1,2], skip_header=1)
Since you are skipping the header line, and just using columns with floats, data should be a 2d array with 3 columns, (N, 3). In that case you could access the 'frequency' values with data[:,2]
I = data[:,2].astype(int)==177
freq_177 = data[I,:]
freq_177 is now a 3 column array - with a subset of the data rows.
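The same selection on a plain 2d array can be sketched as follows, again with an invented tab-delimited sample rather than the real file. The np.isclose line is an alternative I am adding for comparison, not something from the answer above; it avoids the truncation that astype(int) applies to values such as 176.9999.

```python
import io
import numpy as np

# Hypothetical sample: header line plus three rows of floats.
text = "Angle\tRX_Power\tFrequency\n0.0\t-40.5\t177\n1.0\t-41.0\t180\n2.0\t-39.8\t177\n"

# No names argument, so the result is an ordinary 2d float array
data = np.genfromtxt(io.StringIO(text), dtype=float, delimiter="\t",
                     usecols=[0, 1, 2], skip_header=1)
print(data.shape)

# Row mask built from the third column
mask = data[:, 2].astype(int) == 177
freq_177 = data[mask, :]
print(freq_177.shape)

# Alternative float comparison with a tolerance (an assumption, see lead-in)
mask_close = np.isclose(data[:, 2], 177.0)
```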
I have a text file with 93 columns and 1699 rows that I have imported into Python. The first three columns do not contain data that is necessary for what I'm currently trying to do. Within each column, I need to divide each element (aka row) in the column by all of the other elements (rows) in that same column. The result I want is an array of 90 elements where each of 1699 elements has 1699 elements.
A more detailed description of what I'm attempting: I begin with Column3. At Column3, Row1 is to be divided by all the other rows (including the value in Row1) within Column3. That will give Row1 1699 calculations. Then the same process is done for Row2 and so on until Row1699. This gives Column3 1699x1699 calculations. When the calculations of all of the rows in Column 3 have completed, then the program moves on to do the same thing in Column 4 for all of the rows. This is done for all 90 columns which means that for the end result, I should have 90x1699x1699 calculations.
My code as it currently is is:
import numpy as np
from glob import glob
fnames = glob("NIR_data.txt")
arrays = np.array([np.loadtxt(f, skiprows=1) for f in fnames])
NIR_values = np.concatenate(arrays)
NIR_band = NIR_values.T
C_values = []
for i in range(3,len(NIR_band)):
    for j in range(0,len(NIR_band[3])):
        loop_list = NIR_band[i][j]/NIR_band[i,:]
        C_values.append(loop_list)
What it produces is an array of 1699x1699 dimension. Each individual array is the results from the Row calculations. Another complaint is that the code takes ages to run. So, I have two questions, is it possible to create the type of array I'd like to work with? And, is there a faster way of coding this calculation?
Dividing each of the numbers in a given column by each of the other values in the same column can be accomplished in one operation as follows.
result = a[:, numpy.newaxis, :] / a[numpy.newaxis, :, :]
Because looping over the elements happens in the optimized binary depths of numpy, this is as fast as Python is ever going to get for this operation.
If a.shape was (1699, 90) to begin with, then the result will have shape (1699, 1699, 90). Assuming dtype=float64, that is 1699 x 1699 x 90 x 8 bytes, so you will need about 2 GB of memory available to store the result.
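A scaled-down sketch of that one-liner, using a random (5, 3) array as a stand-in for the real (1699, 90) data, plus the arithmetic behind the memory estimate. The variable names other than a and result are mine.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.uniform(1.0, 2.0, size=(5, 3))  # small stand-in: 5 rows, 3 columns

# Broadcasting: (5, 1, 3) / (1, 5, 3) -> (5, 5, 3)
result = a[:, np.newaxis, :] / a[np.newaxis, :, :]
print(result.shape)

# result[i, j, k] is row i divided by row j, within column k
i, j, k = 1, 2, 0
print(np.isclose(result[i, j, k], a[i, k] / a[j, k]))

# Memory needed for the full-size float64 result
n_rows, n_cols = 1699, 90
n_bytes = n_rows * n_rows * n_cols * np.dtype(np.float64).itemsize
print(n_bytes / 1e9, "GB")
```

If that much memory is not available, processing one column (or a block of columns) at a time keeps the peak usage down at the cost of an outer Python loop.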
First let's focus on the load:
arrays = np.array([np.loadtxt(f, skiprows=1) for f in fnames])
NIR_values = np.concatenate(arrays)
Your text talks about loading a file, and manipulating it. But this clip loads multiple files and joins them.
My first change is to collect the arrays in a list, not another array
alist = [np.loadtxt(f, skiprows=1) for f in fnames]
If you want to skip some columns, look at using the usecols parameter. That may save you work later.
The elements of alist will now be 2d arrays (of floats). If they are matching sizes (N,M), they can be joined in various ways. If there are n files, then
arrays = np.array(alist) # (n,N,M) array
arrays = np.concatenate(alist, axis=0) # (n*N, M) array
# similarly for axis=1
Your code does the same, but potentially confuses steps:
In [566]: arrays = np.array([np.ones((3,4)) for i in range(5)])
In [567]: arrays.shape
Out[567]: (5, 3, 4) # (n,N,M) array
In [568]: NIR_values = np.concatenate(arrays)
In [569]: NIR_values.shape
Out[569]: (15, 4) # (n*N, M) array
NIR_band is now (4,15), and its len() is .shape[0], the size of the 1st dimension. len(NIR_band[3]) is .shape[1], the size of the 2nd dimension.
You could skip the columns of NIR_values with NIR_values[:,3:].
I get lost in the rest of the text description.
The NIR_band[i][j]/NIR_band[i,:], I would rewrite as NIR_band[i,j]/NIR_band[i,:]. What's the purpose of that?
As for your subject line, Storing multiple arrays within multiple arrays within an array - that sounds like making a 3 or 4d array. arrays is 3d, NIR_values is 2d.
Creating a (90,1699,1699) from a (93,1699) will probably involve (without iteration) a calculation analogous to:
In [574]: X = np.arange(13*4).reshape(13,4)
In [575]: X.shape
Out[575]: (13, 4)
In [576]: (X[3:,:,None]+X[3:,None,:]).shape
Out[576]: (10, 4, 4)
Each operand gets a new axis via None (np.newaxis), one at the end and one in the middle, and the two versions are broadcast against each other. np.outer performs the multiplicative form of this calculation on a pair of 1d arrays.
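Swapping the + for the division in the question gives something like the sketch below. It uses a random (13, 4) array as a stand-in for the (93, 1699) NIR_band, and checks the broadcast result against the double loop from the question. The names bands, C and C_loop are mine.

```python
import numpy as np

rng = np.random.default_rng(1)
NIR_band = rng.uniform(1.0, 2.0, size=(13, 4))  # stand-in for the (93, 1699) array

bands = NIR_band[3:]  # skip the first three bands -> (10, 4)

# C[i, j, k] = bands[i, j] / bands[i, k], shape (10, 4, 4)
C = bands[:, :, None] / bands[:, None, :]
print(C.shape)

# The question's loop, collected per band so it has the same 3d layout
C_loop = np.array([[bands[i, j] / bands[i, :]
                    for j in range(bands.shape[1])]
                   for i in range(bands.shape[0])])
print(np.allclose(C, C_loop))

# For a single band, the ufunc method np.divide.outer gives the same 2d table
print(np.allclose(np.divide.outer(bands[0], bands[0]), C[0]))
```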
I'm trying to add column names to a numpy ndarray, then select columns by their names. But it doesn't work. I can't tell if the problem occurs when I add the names, or later when I try to call them.
Here's my code.
data = np.genfromtxt(csv_file, delimiter=',', dtype=np.float, skip_header=1)
#Add headers
csv_names = [ s.strip('"') for s in file(csv_file,'r').readline().strip().split(',')]
data = data.astype(np.dtype( [(n, 'float64') for n in csv_names] ))
Dimension-based diagnostics match what I expect:
print len(csv_names)
>> 108
print data.shape
>> (1652, 108)
"print data.dtype.names" also returns the expected output.
But when I start calling columns by their field names, screwy things happen. The "column" is still an array with 108 columns...
print data["EDUC"].shape
>> (1652, 108)
... and it appears to contain more missing values than there are rows in the data set.
print np.sum(np.isnan(data["EDUC"]))
>> 27976
Any idea what's going wrong here? Adding headers should be a trivial operation, but I've been fighting this bug for hours. Help!
The problem is that you are thinking in terms of spreadsheet-like arrays, whereas NumPy uses different concepts.
Here is what you must know about NumPy:
NumPy arrays only contain elements of a single type.
If you need spreadsheet-like "columns", this type must be some tuple-like type. Such arrays are called Structured Arrays, because their elements are structures (i.e. tuples).
In your case, NumPy would thus take your 2-dimensional regular array and produce a one-dimensional array whose type is a 108-element tuple (the spreadsheet array that you are thinking of is 2-dimensional).
These choices were probably made for efficiency reasons: all the elements of an array have the same type and therefore have the same size: they can be accessed, at a low-level, very simply and quickly.
Now, as user545424 showed, there is a simple NumPy answer to what you want to do (genfromtxt() accepts a names argument with column names).
If you want to convert your array from a regular NumPy ndarray to a structured array, you can do:
data.view(dtype=[(n, 'float64') for n in csv_names]).reshape(len(data))
(you were close: you used astype() instead of view()).
You can also check the answers to quite a few Stackoverflow questions, including Converting a 2D numpy array to a structured array and how to convert regular numpy array to record array?.
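A small sketch of the view() conversion, assuming a C-contiguous float64 array. The three column names and the (4, 3) data are made up to stand in for the real 108 names and (1652, 108) array.

```python
import numpy as np

csv_names = ["AGE", "EDUC", "INCOME"]                # hypothetical column names
data = np.arange(12, dtype="float64").reshape(4, 3)  # stand-in for the real array

# Reinterpret each row of 3 floats as one 3-field record, then drop the
# leftover length-1 axis so the result is 1d with one record per row.
structured = data.view(dtype=[(n, "float64") for n in csv_names]).reshape(len(data))

print(structured.shape)    # 1d
print(structured["EDUC"])  # second column of the original array
```

Because view() only reinterprets the existing buffer, structured shares memory with data, so writing to one changes the other.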
Unfortunately, I don't know what is going on when you try to add the field names, but I do know that you can build the array you want directly from the file via
data = np.genfromtxt(csv_file, delimiter=',', names=True)
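For illustration, here is that call on an in-memory sample. The header and values are invented, and the header is unquoted; how quoted header names are handled is not something this sketch covers.

```python
import io
import numpy as np

# Hypothetical CSV content standing in for csv_file
text = "AGE,EDUC,INCOME\n30,12,40000\n45,16,65000\n"

# names=True takes the field names from the first line of the file
data = np.genfromtxt(io.StringIO(text), delimiter=",", names=True)

print(data.shape)        # 1d, one record per data row
print(data.dtype.names)  # names read from the header
print(data["EDUC"])      # a single column, selected by name
```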
EDIT:
It seems like adding field names only works when the input is a list of tuples:
data = np.array(list(map(tuple, data)), dtype=[(n, 'float64') for n in csv_names])  # list() is needed on Python 3, where map returns an iterator