ValueError: Cannot convert [] to Excel - python

I'm trying to run a program that connects to a postrgres database, runs a query and saves this result in an excel spreadsheet.
The problem I'm facing is at the time I run the query. The result of the 'memberof' column has an empty {}.
rolname | rolsuper | rolinherit | rolcreaterole | rolcreatedb | rolcanlogin | rolconnlimit | rolvaliduntil | memberof | rolreplication | rolbypassrls
------------+----------+------------+---------------+-------------+-------------+--------------+---------------+----------+----------------+--------------
joe | f | t | f | f | t | -1 | | {} | f | f
postgres | t | t | t | t | t | -1 | | {} | t | t
test_joe | f | t | f | f | t | -1 | | {} | f | f
Because of this, my code returns this error output.
Traceback (most recent call last):
File "/path/to/file.py", line 55, in <module>
GetPostgresqlData()
File "/path/to/file.py", line 52, in GetPostgresqlData
ws.append(row)
File "/path/to/.local/lib/python3.9/site-packages/openpyxl/worksheet/worksheet.py", line 665, in append
cell = Cell(self, row=row_idx, column=col_idx, value=content)
File "/path/to/.local/lib/python3.9/site-packages/openpyxl/cell/cell.py", line 116, in __init__
self.value = value
File "/path/to/.local/lib/python3.9/site-packages/openpyxl/cell/cell.py", line 215, in value
self._bind_value(value)
File "/path/to/.local/lib/python3.9/site-packages/openpyxl/cell/cell.py", line 184, in _bind_value
raise ValueError("Cannot convert {0!r} to Excel".format(value))
ValueError: Cannot convert [] to Excel
What should I do so that the result of this column and the others is written in the excel spreadsheet?
Here is the query I'm running against my testing database.
SELECT r.rolname, r.rolsuper, r.rolinherit,
r.rolcreaterole, r.rolcreatedb, r.rolcanlogin,
r.rolconnlimit, r.rolvaliduntil,
ARRAY(SELECT b.rolname
FROM pg_catalog.pg_auth_members m
JOIN pg_catalog.pg_roles b ON (m.roleid = b.oid)
WHERE m.member = r.oid) as memberof
, r.rolreplication
, r.rolbypassrls
FROM pg_catalog.pg_roles r
WHERE r.rolname !~ '^pg_'
ORDER BY 1;
Here is my code.
def GetPostgresqlData():
wb = Workbook()
wb.remove(wb['Sheet'])
ws = wb.create_sheet(0)
ws.title = 'Titulo 01'
hosts = open('serverlist.csv', 'r')
hosts = hosts.readlines()[1:]
for i in range(len(filequery[1:])):
with open(filequery[i+1], 'r') as postgresql_query:
query = postgresql_query.read()
for row in hosts:
command_Array = row.strip('\n').split(',')
db_host = command_Array[0]
db_port = command_Array[1]
db_name = command_Array[2]
db_user = command_Array[3]
db_password = command_Array[4]
postgresql_connection = psycopg2.connect(user=db_user,
password=db_password,
host=db_host,
port=db_port,
database=db_name)
cursor = postgresql_connection.cursor()
cursor.execute(query)
colnames = [desc[0] for desc in cursor.description]
records = cursor.fetchall()
ws.append(colnames)
for row in records:
ws.append(row)
wb.save("postgresql.xlsx")
GetPostgresqlData()

The problem I was facing that was pointed out by Charlie Chuck in the first comment is that I can't save a list in a single cell. So I solved the problem by converting the array to string as shown below.
SELECT r.rolname, r.rolsuper, r.rolinherit,
r.rolcreaterole, r.rolcreatedb, r.rolcanlogin,
r.rolconnlimit, r.rolvaliduntil,
array_to_string(ARRAY(SELECT b.rolname
FROM pg_catalog.pg_auth_members m
JOIN pg_catalog.pg_roles b ON (m.roleid = b.oid)
WHERE m.member = r.oid), ',', '*') as memberof
, r.rolreplication
, r.rolbypassrls
FROM pg_catalog.pg_roles r
WHERE r.rolname !~ '^pg_'
ORDER BY 1

Related

Using Pyspark how to convert plain text to csv file

When I created a hive table, the data is as follows.
data file
<__name__>abc
<__code__>1
<__value__>1234
<__name__>abcdef
<__code__>2
<__value__>12345
<__name__>abcdef
<__code__>2
<__value__>12345
1234156321
<__name__>abcdef
<__code__>2
<__value__>12345
...
Can I create a table right away without converting the file?
It's a plain text file, three columns are repeated.
How to convert dataframe? or csv file?
I want
| name | code | value
| abc | 1 | 1234
| abcdef | 2 | 12345
...
or
abc,1,1234
abcdef,2,12345
...
I solved my problem like this.
data = spark.read.text(path)
rows = data.rdd.zipWithIndex().map(lambda x: Row(x[0].value, int(x[1]/3)))
schema = StructType() \
.add("col1",StringType(), False) \
.add("record_pos",IntegerType(), False)
df = spark.createDataFrame(rows, schema)
df1 = df.withColumn("key", regexp_replace(split(df["col1"], '__>')[0], '<|__', '')) \
.withColumn("value", regexp_replace(regexp_replace(split(df["col1"], '__>')[1], '\n', '<NL>'), '\t', '<TAB>'))
dataframe = df1.groupBy("record_pos").pivot("key").agg(first("value")).drop("record_pos")
dataframe.show()
val path = "file:///C:/stackqustions/data/stackq5.csv"
val data = sc.textFile(path)
import spark.implicits._
val rdd = data.zipWithIndex.map {
case (records, index) => Row(records, index / 3)
}
val schema = new StructType().add("col1", StringType, false).add("record_pos", LongType, false)
val df = spark.createDataFrame(rdd, schema)
val df1 = df
.withColumn("key", regexp_replace(split($"col1", ">")(0), "<|__", ""))
.withColumn("value", split($"col1", ">")(1)).drop("col1")
df1.groupBy("record_pos").pivot("key").agg(first($"value")).drop("record_pos").show
result:
+----+------+-----+
|code| name|value|
+----+------+-----+
| 1| abc| 1234|
| 2|abcdef|12345|
| 2|abcdef|12345|
| 2|abcdef|12345|
+----+------+-----+

Creating a formatted table from a list of tuples

I have a list of data that looks something like this:
[
(
1,
u'python -c \'print("ok")\'',
u'data',
u'python'
), (
2,
u'python -c \'print("this is some data")\'',
u'data',
u'python'
)
]
This data is taken out of a database and displayed as this, and is continuously growing. What I would like to do is display the data like this:
Language | Type | Payload
-------------------------------
python | data | python -c 'print("ok")'
python | data | python -c 'print("this is some data")'
I have a function that kinda does the same thing, but it's not exactly as expected:
def print_table(data, cols, width):
n, r = divmod(len(data), cols)
pattern = "{{:{}}}".format(width)
line = "\n".join(pattern * cols for _ in range(n))
last_line = pattern * r
print(line.format(*data))
print(last_line.format(*data[n*cols]))
How can I get the output of my data to look as wanted? From the answers it's possible with pandas but I would also like a way to do it without installing external modules
Analyze the data for its max-width and use string formatting - some 'creative' formatting later:
data = [
(
1,
u'python -c \'print("ok")\'',
u'data',
u'python'
), (
2,
u'python -c \'print("this is some data")\'',
u'data',
u'python'
)
]
def print_table(data):
widths = {0:0, 3:len("Language"),2:len("Type"),1:len("Payload")}
for k in data:
for i,d in enumerate(k):
widths[i] = max(widths[i],len(str(d)))
# print(widths)
lan, typ, pay = ["Language","Type","Payload"]
print(f"{lan:<{widths[3]}} | {typ:<{widths[2]}} | {pay:<{widths[1]}}")
# adjust by 10 for ' | ' twice
print("-" * (widths[1]+widths[2]+widths[3]+10))
for k in data:
_, pay, typ, lan = k
print(f"{lan:<{widths[3]}} | {typ:<{widths[2]}} | {pay:<{widths[1]}}")
Output:
Language | Type | Payload
------------------------------------------------------------
python | data | python -c 'print("ok")'
python | data | python -c 'print("this is some data")'
string format mini language
Equivalent Python 2.7 code:
# w == widths - would break 79 chars/line else wise
def print_table(data):
w = {0:0, 3:len("Language"),2:len("Type"),1:len("Payload")}
for k in data:
for i,d in enumerate(k):
w[i] = max(w[i],len(str(d)))
lan, typ, pay = ["Language","Type","Payload"]
print "{:<{}} | {:<{}} | {:<{}}".format(lan, w[3], typ, w[2], pay, w[1])
print "-" * (w[1]+w[2]+w[3]+10)
for k in data:
_, pay, typ, lan = k
print "{:<{}} | {:<{}} | {:<{}}".format(lan, w[3], typ, w[2], pay, w[1])
Solution that handles any number of columns:
from operator import itemgetter
data = [
('ID', 'Payload', 'Type', 'Language'),
(1, u'python -c \'print("ok")\'', u'data', u'python'),
(2, u'python -c \'print("this is some data")\'', u'data', u'python')
]
def print_table(data):
lengths = [
[len(str(x)) for x in row]
for row in data
]
max_lengths = [
max(map(itemgetter(x), lengths))
for x in range(0, len(data[0]))
]
format_str = ''.join(map(lambda x: '%%-%ss | ' % x, max_lengths))
print(format_str % data[0])
print('-' * (sum(max_lengths) + len(max_lengths) * 3 - 1))
for x in data[1:]:
print(format_str % x)
print_table(data)
Output:
$ python table.py
ID | Payload | Type | Language |
---------------------------------------------------------------
1 | python -c 'print("ok")' | data | python |
2 | python -c 'print("this is some data")' | data | python |
You can use pandas for that:
import pandas as pd
data = pd.DataFrame(a, columns=['id','Payload', 'type', 'Language'])
print(data)
gives you:
id Payload type Language
0 1 python -c 'print("ok")' data python
1 2 python -c 'print("this is some data")' data python

How to split a comma-delimited list in IronPython (Spotfire)?

I have a existing data table with two columns, one is a ID and one is a list of IDs, separated by comma.
For example
ID | List
---------
1 | 1, 4, 5
3 | 2, 12, 1
I would like to split the column List so that I have a table like this:
ID | List
---------
1 | 1
1 | 4
1 | 5
3 | 2
3 | 12
3 | 1
I figured this out now:
tablename='Querysummary Data'
table=Document.Data.Tables[tablename]
topiccolname='TOPIC_ID'
topiccol=table.Columns[topiccolname]
topiccursor=DataValueCursor.Create[str](topiccol)
docscolname='DOC_IDS'
doccol=table.Columns[docscolname]
doccursor=DataValueCursor.Create[str](doccol)
myPanel = Document.ActivePageReference.FilterPanel
idxSet = myPanel.FilteringSchemeReference.FilteringSelectionReference.GetSelection(table).AsIndexSet()
keys=dict()
topdoc=dict()
for row in table.GetRows(idxSet,topiccursor,doccursor):
keys[topiccursor.CurrentValue]=doccursor.CurrentValue
for key in keys:
str = keys[key].split(",")
for i in str:
topdoc[key]=i
print key + " " +i
now I can print the topic id with the corresponding id.
How can I create a new data table in Spotfire using this dict()?
I solved it myself finally..maybe there is some better code but it works:
tablename='Querysummary Data'
table=Document.Data.Tables[tablename]
topiccolname='TOPIC_ID'
topiccol=table.Columns[topiccolname]
topiccursor=DataValueCursor.Create[str](topiccol)
docscolname='DOC_IDS'
doccol=table.Columns[docscolname]
doccursor=DataValueCursor.Create[str](doccol)
myPanel = Document.ActivePageReference.FilterPanel
idxSet = myPanel.FilteringSchemeReference.FilteringSelectionReference.GetSelection(table).AsIndexSet()
# build a string representing the data in tab-delimited text format
textData = "TOPIC_ID;DOC_IDS\r\n"
keys=dict()
topdoc=dict()
for row in table.GetRows(idxSet,topiccursor,doccursor):
keys[topiccursor.CurrentValue]=doccursor.CurrentValue
for key in keys:
str = keys[key].split(",")
for i in str:
textData += key + ";" + i + "\r\n"
dataSet = DataSet()
dataTable = DataTable("DOCIDS")
dataTable.Columns.Add("TOPIC_ID", System.String)
dataTable.Columns.Add("DOC_IDS", System.String)
dataSet.Tables.Add(dataTable)
# make a stream from the string
stream = MemoryStream()
writer = StreamWriter(stream)
writer.Write(textData)
writer.Flush()
stream.Seek(0, SeekOrigin.Begin)
# set up the text data reader
readerSettings = TextDataReaderSettings()
readerSettings.Separator = ";"
readerSettings.AddColumnNameRow(0)
readerSettings.SetDataType(0, DataType.String)
readerSettings.SetDataType(1, DataType.String)
readerSettings.SetDataType(2, DataType.String)
# create a data source to read in the stream
textDataSource = TextFileDataSource(stream, readerSettings)
# add the data into a Data Table in Spotfire
if Document.Data.Tables.Contains("Querysummary Mapping"):
Document.Data.Tables["Querysummary Mapping"].ReplaceData(textDataSource)
else:
newTable = Document.Data.Tables.Add("Querysummary Mapping", textDataSource)
tableSettings = DataTableSaveSettings (newTable, False, False)
Document.Data.SaveSettings.DataTableSettings.Add(tableSettings)

mysql-python fetchrow brings: TypeError: 'long' object is not callable

I got a table:
+-----------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+-------+
| Benutzer | varchar(50) | YES | | NULL | |
| Confirmed | tinyint(1) | YES | | NULL | |
+-----------+-------------+------+-----+---------+-------+
with one entry!
If I execute on the mysql console in a shell:
select Benutzer from UserConfirm where Benutzer = '\{\'gid\'\:\ \'tamer\'\,\ \'uid\'\:\ \'tamer\'\}'
it works!
At mysql-python there comes the errormessage:
TypeError: 'long' object is not callable
What did I make wrong?!
Here is my python code:
cursor = self.__db.cursor().execute('select * from UserConfirm where Benutzer = \'' + ds + '\'')
return cursor().fetchrow()
For any advice, I would kindly thank you.
The problem is that you are not storing the cursor object, just the return value of execute, which is not the cursor, it should be:
cursor = self.__db.cursor()
cursor.execute('select * from UserConfirm where Benutzer = \'' + ds + '\'')
return cursor.fetchone()
Note that I assume your line cursor().fetchrow() is a typo and you meant cursor.fetchrow().
Not knowing mysql-python myself, from the error I can assume cursor.execute is returning the number of rows or an error code, not the cursor itself.
Try the following instead:
cursor = self.__db.cursor()
cursor.execute('select * from UserConfirm where Benutzer = \'' + ds + '\'')
return cursor.fetchrow()

How can I replace blank entries in a text table with 0 in Python?

I have tables which looks like this:
text = """
ID = 1234
Hello World 135,343 117,668 81,228
Another line of text (30,632) (48,063)
More text 0 11,205 0
Even more text 1,447 681
ID = 18372
Another table 35,323 38,302 909,381
Another line with text 13 15
More text here 7 0
Even more text here 7,011 1,447 681
"""
Is there a way to replace the "blank" entries in each table with 0? I am trying to set delimiters between the entries, but using the following code can't deal with blank spots in the tables:
for line in text.splitlines():
if 'ID' not in line:
line1 = line.split()
line = '|'.join((' '.join(line1[:-3]), '|'.join(line1[-3:])))
print line
else:
print line
The output is:
ID = 1234
|
Hello World|135,343|117,668|81,228
Another line of|text|(30,632)|(48,063)
More text|0|11,205|0
Even more|text|1,447|681
|
ID = 18372
|
Another table|35,323|38,302|909,381
Another line with|text|13|15
More text|here|7|0
Even more text here|7,011|1,447|681
As you can see, the first problem shows up on the second line of the first table. The word 'text' is considered the first column. Any way to fix this in Python to replace blank entries with 0?
Here is a function for finding columns in a bunch of lines. The second argument pat defines what a column is, and can be any regex.
import itertools as it
import re
def find_columns(lines, pat = r' '):
'''
Usage:
widths = find_columns(lines)
for line in lines:
if not line: continue
vals = [ line[widths[i]:widths[i+1]].strip() for i in range(len(widths)-1) ]
'''
widths = []
maxlen = max(len(line) for line in lines)
for line in lines:
line = ''.join([line, ' '*(maxlen-len(line))])
candidates = []
for match in re.finditer(pat, line):
candidates.extend(range(match.start(), match.end()+1))
widths.append(set(candidates))
widths = sorted(set.intersection(*widths))
diffs = [widths[i+1]-widths[i] for i in range(len(widths)-1)]
diffs = [None]+diffs
widths = [w for d, w in zip(diffs, widths) if d != 1]
if widths[0] != 0: widths = [0]+widths
return widths
def report(text):
for key, group in it.groupby(text.splitlines(), lambda line:line.startswith('ID')):
lines = list(group)
if key:
print('\n'.join(lines))
else:
# r' (?![a-zA-Z])' defines a column to be any whitespace
# not followed by alphabetic characters.
widths = find_columns(lines, pat = r'\s(?![a-zA-Z])')
for line in lines:
if not line: continue
vals = [ line[widths[i]:widths[i+1]] for i in range(len(widths)-1) ]
vals = [v if v.strip() else v[1:]+'0' for v in vals]
print('|'.join(vals))
text = """\
ID = 1234
Hello World 135,343 117,668 81,228
Another line of text (30,632) (48,063)
More text 0 11,205 0
Even more text 1,447 681
ID = 18372
Another table 35,323 38,302 909,381
Another line with text 13 15
More text here 7 0
Even more text here 7,011 1,447 681
"""
report(text)
yields
ID = 1234
Hello World | 135,343| 117,668| 81,228
Another line of text| (30,632)| 0| (48,063)
More text | 0 | 11,205| 0
Even more text | 0| 1,447 | 681
ID = 18372
Another table | 35,323| 38,302| 909,381
Another line with text| 13 | 15|0
More text here | 0| 7 | 0
Even more text here | 7,011| 1,447| 681

Categories

Resources