Python pandas pass a variable from inner loop to outer loop - python

I have the Python code below, but I can't pass the inner loop's variable to the outer loop.
Every time the inner loop breaks, the value of variable "x" resets to its initial value.
import pandas as pd
import csv
user_list = pd.read_csv(r'C:\Users\Administrator\Desktop\user_list.csv')
domain_list = pd.read_csv(r'C:\Users\Administrator\Desktop\domainlist.csv')
x=1
y=0
for y in user_list.index:
    print(user_list.iloc[y,0])
    for x in domain_list.index:
        print(domain_list.iloc[x,0])
        x=x+1
        if(x % 10 == 0):
            break
print("out of loop, value of x is "+str(x))
Below are my CSV files:
User_list
user1,pw1
user2,pw2
user3,pw3
Domain_list
burton.com
amazon.com
gizmodo.com
theverge.com
venturebeat.com
digitaltrends.com
mashable.com
theinformation.com
engadget.com
arstechnica.com
techcrunch.com
thenextweb.com
tomshardware.com
roblox.com
discord.com
office.com
tiktok.com
wikipedia.org
baidu.com
samsung.com
bilibili.com
duckduckgo.com
Desired output is as below
After User1 is printed, websites 1-10 should be printed; then User2 is printed, followed by websites 11-20.
User1
burton.com
amazon.com
gizmodo.com
theverge.com
venturebeat.com
digitaltrends.com
mashable.com
theinformation.com
engadget.com
arstechnica.com
User2
techcrunch.com
thenextweb.com
tomshardware.com
roblox.com
discord.com
office.com
tiktok.com
wikipedia.org
baidu.com
samsung.com
Current output is as below
user1
burton.com
amazon.com
gizmodo.com
theverge.com
venturebeat.com
digitaltrends.com
mashable.com
theinformation.com
engadget.com
arstechnica.com
user2
burton.com
amazon.com
gizmodo.com
theverge.com
venturebeat.com
digitaltrends.com
mashable.com
theinformation.com
engadget.com
arstechnica.com
user3
burton.com
amazon.com
gizmodo.com
theverge.com
venturebeat.com
digitaltrends.com
mashable.com
theinformation.com
engadget.com
arstechnica.com
out of loop, value of x is 10

This is the correct behaviour: each time control returns to the outer loop, the inner loop starts again from the beginning.
You can do something like this instead: keep a separate counter z outside both loops and index with it.
z=0
for y in user_list.index:
    print(user_list.iloc[y,0])
    for x in domain_list.index:
        if len(domain_list)==z:
            break
        print(domain_list.iloc[z,0])
        z=z+1
        if(z % 10 == 0):
            break
Output:
user1
burton.com
amazon.com
gizmodo.com
theverge.com
venturebeat.com
digitaltrends.com
mashable.com
theinformation.com
engadget.com
arstechnica.com
user2
techcrunch.com
thenextweb.com
tomshardware.com
roblox.com
discord.com
office.com
tiktok.com
wikipedia.org
baidu.com
samsung.com
user3
bilibili.com
duckduckgo.com
out of loop, value of x is 22
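As an alternative to carrying a counter across iterations, the same grouping can be sketched with slicing. This is only an illustration under some assumptions: the helper name assign_chunks and the placeholder domain names are made up here, and plain lists stand in for the DataFrame columns (with pandas, something like domain_list.iloc[:, 0].tolist() would produce such a list).

```python
def assign_chunks(users, domains, chunk_size=10):
    """Pair each user with the next `chunk_size` domains, in order."""
    pairs = []
    for i, user in enumerate(users):
        start = i * chunk_size
        # Slicing past the end of a list is safe: it just returns fewer items.
        pairs.append((user, domains[start:start + chunk_size]))
    return pairs


users = ["user1", "user2", "user3"]
domains = [f"site{n}.com" for n in range(1, 23)]  # 22 placeholder domains

for user, chunk in assign_chunks(users, domains):
    print(user)
    for domain in chunk:
        print(domain)
```

With 22 domains and a chunk size of 10, the third user receives only the two remaining entries, which matches the shape of the output shown above.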

Related

Python - XML file to Pandas Dataframe [duplicate]

This question already has answers here:
How to convert an XML file to nice pandas dataframe?
I'm fairly new to Python and am hoping to get some help transforming an XML file into a pandas DataFrame. I have searched other resources but am still stuck. I'm looking to get all the fields inside the RECORDING tag into a table. Any help is greatly appreciated! Thank you.
Below is the code I tried, but it is not working properly.
import xml.etree.ElementTree as ET
import pandas as pd
xml_data = open('5249009-08-34-59-126029.xml', 'r').read()
root = ET.XML(xml_data)
data = []
cols = []
for i, child in enumerate(root):
    data.append([subchild.text for subchild in child])
    cols.append(child.tag)
df = pd.DataFrame(data).T
df.columns = cols
print(df)
Below is sample input data:
<?xml version="1.0"?>
<RECORDING>
<IDENT>0</IDENT>
<DEVICEID>133242232</DEVICEID>
<DEVICEALIAS>52232009</DEVICEALIAS>
<GROUP>1823481655</GROUP>
<GATE>1011655</GATE>
<ANI>7777777777</ANI>
<DNIS>777777777</DNIS>
<USER1>00:07:53.2322691,00:03:21.34232761</USER1>
<USER2>text</USER2>
<USER3/>
<USER4/>
<USER5>34fc0a8d-d5632c9b1</USER5>
<USER6>000dfsdf98701596638094</USER6>
<USER7>97</USER7>
<USER8>00701596638094</USER8>
<USER9>10155</USER9>
<USER10/>
<USER11/>
<USER12/>
<USER13>Text</USER13>
<USER14>4</USER14>
<USER15>10</USER15>
<CALLSEGMENTID/>
<CALLID>9870</CALLID>
<FILENAME>\\folderpath\folderpath\folderpath\folderpath\2020\Aug\05\5249009\52343109-234234-34-59-1234234029</FILENAME>
<DURATION>201</DURATION>
<STARTYEAR>2020</STARTYEAR>
<STARTMONTH>08</STARTMONTH>
<STARTMONTHNAME>August</STARTMONTHNAME>
<STARTDAY>05</STARTDAY>
<STARTDAYNAME>Wednesday</STARTDAYNAME>
<STARTHOUR>08</STARTHOUR>
<STARTMINUTE>34</STARTMINUTE>
<STARTSECOND>59</STARTSECOND>
<PRIORITY>50</PRIORITY>
<RECORDINGTYPE>S</RECORDINGTYPE>
<CALLDIRECTION>I</CALLDIRECTION>
<SCREENCAPTURE>7</SCREENCAPTURE>
<KEEPCALLFORDAYS>90</KEEPCALLFORDAYS>
<BLACKOUTREMOTEAUDIO>false</BLACKOUTREMOTEAUDIO>
<BLACKOUTS/>
</RECORDING>
One possible way to parse the file:
import pandas as pd
from bs4 import BeautifulSoup
soup = BeautifulSoup(open("your_file.xml", "r"), "xml")
d = {}
for tag in soup.RECORDING.find_all(recursive=False):
    d[tag.name] = tag.get_text(strip=True)
df = pd.DataFrame([d])
print(df)
Prints:
IDENT DEVICEID DEVICEALIAS GROUP GATE ANI DNIS USER1 USER2 USER3 USER4 USER5 USER6 USER7 USER8 USER9 USER10 USER11 USER12 USER13 USER14 USER15 CALLSEGMENTID CALLID FILENAME DURATION STARTYEAR STARTMONTH STARTMONTHNAME STARTDAY STARTDAYNAME STARTHOUR STARTMINUTE STARTSECOND PRIORITY RECORDINGTYPE CALLDIRECTION SCREENCAPTURE KEEPCALLFORDAYS BLACKOUTREMOTEAUDIO BLACKOUTS
0 0 133242232 52232009 1823481655 1011655 7777777777 777777777 00:07:53.2322691,00:03:21.34232761 text 34fc0a8d-d5632c9b1 000dfsdf98701596638094 97 00701596638094 10155 Text 4 10 9870 \\folderpath\folderpath\folderpath\folderpath\... 201 2020 08 August 05 Wednesday 08 34 59 50 S I 7 90 false
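For comparison, here is a minimal sketch of the same idea using only the standard library's xml.etree.ElementTree, which the question already imports. The XML string is a trimmed-down stand-in for the sample input (only a few fields kept), and the variable name record is just illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample modeled on the question's input (only a few fields kept).
xml_data = """<?xml version="1.0"?>
<RECORDING>
  <IDENT>0</IDENT>
  <DEVICEID>133242232</DEVICEID>
  <USER3/>
  <CALLID>9870</CALLID>
</RECORDING>"""

root = ET.fromstring(xml_data)

# One entry per direct child of the root: tag name -> text.
# Empty elements such as <USER3/> have text None, so fall back to "".
record = {child.tag: (child.text or "").strip() for child in root}
print(record)
```

Passing the dict on as pd.DataFrame([record]) should then give a one-row frame with one column per tag, in the same spirit as the BeautifulSoup version.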

Printing out scraped data with python selenium

Here is the way I came up with to print out scraped data:
pool_to_search_for_loads = driver.find_element(By.XPATH, '//*[@id="searchResults"]/div[5]/div')
loads_contact = pool_to_search_for_loads.find_elements(By.CLASS_NAME, 'contact')
loads_origin = pool_to_search_for_loads.find_elements(By.CLASS_NAME, 'origin')
loads_dest = pool_to_search_for_loads.find_elements(By.CLASS_NAME, 'dest')
def parse_printer(info1, info2, info3):
    count = 0
    while count < len(info1):
        print(info1[count].text, ' from ', info2[count].text, ' to ', info3[count].text)
        count += 1
parse_printer(loads_contact, loads_origin, loads_dest)
This gives me such output:
(800) 999-0101 from Hernando, FL to Port Huron, MI
(800) 999-0101 from Albany, GA to Dayton, OH
(800) 999-0101 from Valdosta, GA to Cincinnati, OH
(800) 999-0101 from Tallahassee, FL to Indianapolis, IN
(800) 999-0101 from Macon, GA to Lexington, KY
Writing a function for this seems like overkill. Is there a more elegant way to print out the results?
It depends on whether you need to keep the loads_contact, loads_origin, and loads_dest element lists for other uses. If not, you could use list comprehensions to extract the text:
loads_contact = [x.text for x in pool_to_search_for_loads.find_elements(By.CLASS_NAME, 'contact')]
loads_origin = [x.text for x in pool_to_search_for_loads.find_elements(By.CLASS_NAME, 'origin')]
loads_dest = [x.text for x in pool_to_search_for_loads.find_elements(By.CLASS_NAME, 'dest')]
Then you could zip those 3 into 1 list and then use the items in that 1 list (combined with f-string).
for item in zip(loads_contact, loads_origin, loads_dest):
    print(f"{item[0]} from {item[1]} to {item[2]}")
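The zip step can also be written with tuple unpacking, which avoids the numeric indexes. The sketch below uses plain strings copied from the sample output in place of live Selenium elements, so the list names are stand-ins rather than the scraped results themselves.

```python
# Plain strings standing in for the .text values of the scraped elements.
contacts = ["(800) 999-0101", "(800) 999-0101"]
origins = ["Hernando, FL", "Albany, GA"]
dests = ["Port Huron, MI", "Dayton, OH"]

# zip() walks the three lists in lockstep; each tuple is unpacked into names.
lines = [
    f"{contact} from {origin} to {dest}"
    for contact, origin, dest in zip(contacts, origins, dests)
]
print("\n".join(lines))
```

Note that zip stops at the shortest list, so a missing element in one column would silently drop trailing rows.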

python-ldap Modify phone number

I would like to change mobile phone numbers in an AD with a python(-ldap) script.
This is the Code I tried to use:
# import needed modules
import ldap
import ldap.modlist as modlist
# Open a connection
l = ldap.initialize("ldap://host/",trace_level=3)
# Bind/authenticate with a user with appropriate rights to add objects
l.simple_bind_s("user@domain","pw")
# The dn of our existing entry/object
dn="CN=common name,OU=Users,OU=x,DC=y,DC=z,DC=e"
# Some place-holders for old and new values
name = 'mobile'
nr1 = '+4712781271232'
nr2 = '+9812391282822'
old = {name:nr1}
new = {name:nr2}
# Convert place-holders for modify-operation using modlist-module
ldif = modlist.modifyModlist(old,new)
# Do the actual modification
l.modify_s(dn, ldif)
# It's nice to the server to disconnect and free resources when done
l.unbind_s()
Unfortunately, I get the following error:
ldap.UNWILLING_TO_PERFORM: {'info': u'00000057: LdapErr:
DSID-0C090FC7, comment: Error in attribute conversion operation, data
0, v4563', 'desc': u'Server is unwilling to perform'}
I am able to delete the entry by leaving old empty, but when I try to set it, I get the following:
LDAPError - TYPE_OR_VALUE_EXISTS: {'info': u'00002083: AtrErr:
DSID-031519F7, #5:\n\t0: 00002083: DSID-031519F7, problem 1006
(ATT_OR_VALUE_EXISTS), data 0, Att 150029 (mobile):len 2\n\t1:
00002083: DSID-031519F7, problem 1006 (ATT_OR_VALUE_EXISTS), data 0,
Att 150029 (mobile):len 2\n\t2: 00002083: DSID-031519F7, problem 1006
(ATT_OR_VALUE_EXISTS), data 0, Att 150029 (mobile):len 2\n\t3:
00002083: DSID-031519F7, problem 1006 (ATT_OR_VALUE_EXISTS), data 0,
Att 150029 (mobile):len 2\n\t4: 00002083: DSID-031519F7, problem 1006
(ATT_OR_VALUE_EXISTS), data 0, Att 150029 (mobile):len 2\n', 'desc':
u'Type or value exists'}
Using the command-line tool ldapmodify, I was able to do these two:
dn:CN=common name,OU=Users,OU=x,DC=y,DC=z,DC=e
changetype: modify
add: mobile
mobile: +1 2345 6789
dn:CN=common name,OU=Users,OU=x,DC=y,DC=z,DC=e
changetype: modify
delete: mobile
mobile: +1 2345 6789
But unable to do this:
dn:CN=common name,OU=Users,OU=x,DC=y,DC=z,DC=e
changetype: modify
replace: mobile
mobile: +1 2345 6789
mobile: +4 567 89012345
Following error:
ldap_modify: Constraint violation (19)
additional info: 00002081: AtrErr: DSID-03151907, #1:
0: 00002081: DSID-03151907, problem 1005 (CONSTRAINT_ATT_TYPE), data 0, Att 150029 (mobile)
I have been trying for some time now and would really appreciate some help.
Never mind the question. I replaced:
nr1 = '+4712781271232'
nr2 = '+9812391282822'
old = {name:nr1}
new = {name:nr2}
With:
old = {'mobile':["+4712781271232"]}
new = {'mobile':["+9812391282822"]}
The brackets do the trick: the attribute values have to be passed as lists ;)

Text to PDF Positioning Lines

I have a text file that I am reading and writing line by line into a PDF. The lines are out of position in the PDF because the FPDF library left-aligns all my lines. I am using set_x so I can position each line to my liking. I am trying to reposition the header lines up to "RATE CODE CY", and I would like all the data under those columns to come after them. Then another header appears, and I would like to align every header that comes after the data in the same way. I know a for loop is needed to bring in the rest of the data. The issue is that a header will come up again, and that is where I have to make the change with set_x.
from fpdf import FPDF
pdf = FPDF("L", "mm", "A4")
pdf.add_page()
pdf.set_font('arial', style='', size=10.0)
lines = file.readlines()
header8 = lines[7]
header8_1 = " ".join(lines[8].split()[:4])
header8_2 = " ".join(lines[8].split()[4:])
header9_1 = " ".join(lines[9].split()[:5])
header9_2 = " ".join(lines[9].split()[5:])
pdf.cell(ln=0, h=5.0, align='L', w=0, txt=header8_1, border=0)
pdf.set_x(125)
pdf.cell(ln=1, h=5.0, align='L', w=0, txt=header8_2, border=0)
pdf.cell(ln=0, h=5.0, align='L', w=0, txt=header9_1, border=0)
pdf.set_x(125)
pdf.cell(ln=1, h=5.0, align='L', w=0, txt=header9_2, border=0)
Current PDF file:
READ SVC B MAXIMUM TOTAL DUE METER NO REMARKS
ACCOUNT # SERVICE ADDRESS CITY DATE DAY C KWH KWD AMOUNT
RATE CODE CY CUSTOMER NAME MAILING ADDRESS
----------------------------------------------------------------------------------------------------
11211-22222 12345 TEST HWY #86 TITUSVIL 10/12/19 29 C 1,444 189.01 ABC1234
GS-1 3 Home & ASSOC INC 1234 Miami HWY APT49
22222-33333 12345 TEST HWY #88 TITUSVIL 10/04/19 29 C 256 41.50 ABC1235
GS-1 3 DGN & ASSOC INC 1234 Miami HWY APT49
READ SVC B MAXIMUM TOTAL DUE METER NO REMARKS
ACCOUNT # SERVICE ADDRESS CITY DATE DAY C KWH KWD AMOUNT
RATE CODE CY CUSTOMER NAME MAILING ADDRESS
----------------------------------------------------------------------------------------------------
11211-22222 12345 TEST HWY #86 TITUSVIL 10/12/19 29 C 1,444 189.01 ABC1234
GS-1 3 Home & ASSOC INC 1234 Miami HWY APT49
22222-33333 12345 TEST HWY #88 TITUSVIL 10/04/19 29 C 256 41.50 ABC1235
GS-1 3 DGN & ASSOC INC 1234 Miami HWY APT49
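One way to approach the recurring header might be to classify each line before deciding how to place it. The sketch below is only a guess at the layout based on the sample above: it assumes every header block begins with a line starting with "READ SVC" and ends at the dashed rule. The function name classify_lines and its header_start parameter are hypothetical, and no FPDF calls are made here.

```python
def classify_lines(lines, header_start="READ SVC"):
    """Yield (kind, text) pairs, where kind is 'header', 'rule', or 'data'."""
    in_header = False
    for raw in lines:
        text = raw.rstrip("\n")
        stripped = text.strip()
        if stripped.startswith(header_start):
            in_header = True  # a new header block begins here
        if stripped and set(stripped) == {"-"}:
            in_header = False  # the dashed rule closes the header block
            yield "rule", text
        elif in_header:
            yield "header", text
        else:
            yield "data", text


sample = [
    "READ SVC B MAXIMUM TOTAL DUE METER NO REMARKS\n",
    "ACCOUNT # SERVICE ADDRESS CITY DATE DAY C KWH KWD AMOUNT\n",
    "RATE CODE CY CUSTOMER NAME MAILING ADDRESS\n",
    "----------\n",
    "11211-22222 12345 TEST HWY #86 TITUSVIL 10/12/19 29 C 1,444 189.01 ABC1234\n",
    "GS-1 3 Home & ASSOC INC 1234 Miami HWY APT49\n",
    "READ SVC B MAXIMUM TOTAL DUE METER NO REMARKS\n",
    "----------\n",
]
kinds = [kind for kind, _ in classify_lines(sample)]
print(kinds)
```

In the PDF loop, lines tagged 'header' could then go through the split-and-set_x handling shown in the question, while 'data' lines are written with a plain cell call.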

For-loop does not iterate to next element in a file

I have a problem with my two for-loops in this code:
def batchm():
    searchFile = sys.argv[1]
    namesFile = sys.argv[2]
    writeFile = sys.argv[3]
    countDict = {}
    with open(searchFile, "r") as nlcfile:
        with open(namesFile, "r") as namesList:
            with open(writeFile, "a") as wfile:
                for name in namesList:
                    for line in nlcfile:
                        if name in line:
                            res = line.split("\t")
                            countValue = res[0]
                            countKey = res[-1]
                            countDict[countKey] = countValue
                    countDictMax = sorted(countDict, key = lambda x: x[1], reverse = True)
                    print(countDictMax)
The loop is iterating over this:
namesList:
Greene
Donald
Donald Duck
MacDonald
.
.
.
nlcfile:
123 1999–2000 Northampton Town F.C. season Northampton Town
5 John Simpson Kirkpatrick
167 File talk:NewYorkRangers1940s.png talk
234 Parshu Ram Sharma(Raj Comics) Parshuram Sharma
.
.
.
What I get looks like this:
['Lyn Greene\n', 'Rydbergia grandiflora (Torrey & A. Gray in A. Gray) E. Greene\n', 'Tyler Greene\n', 'Ty Greene\n' ..... ]
and this list appears 48 times, which also happens to be the number of lines in namesList.
Desired output:
("string from namesList" -> "record with highest number in nlcfile")
Greene -> Ly Greene
Donald -> Donald Duck
.
.
.
I think that the two for-loops don't iterate the right way, but I have no clue why.
Can anyone see where the problem is?
Thank you very much!
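Judging from the code shown, two things are likely at play. A file object can only be iterated once, so after the first name the inner loop over nlcfile has nothing left to read. Each name read from a file also still carries its trailing newline. The sketch below illustrates one way around both, using io.StringIO objects as stand-ins for the real files. The sample records and the tab-separated "count, ..., title" layout are assumptions made up for the illustration.

```python
import io

# Stand-ins for the two input files (hypothetical tab-separated records).
names_file = io.StringIO("Greene\nDonald\n")
search_file = io.StringIO(
    "12\tx\tLyn Greene\n"
    "40\tx\tTyler Greene\n"
    "7\tx\tDonald Duck\n"
)

# Read the searched file into a list once, so it can be scanned for every name.
lines = [line.rstrip("\n") for line in search_file]

best = {}
for raw_name in names_file:
    name = raw_name.strip()  # drop the trailing newline before matching
    matches = []
    for line in lines:
        if name in line:
            fields = line.split("\t")
            matches.append((int(fields[0]), fields[-1]))  # (count, title)
    if matches:
        best[name] = max(matches)[1]  # title of the record with the highest count

for name, title in best.items():
    print(f"{name} -> {title}")
```

Converting the count with int() matters here: compared as strings, "7" would sort above "40".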
