How to modify a column in SQLite3? - python

I have a list A that has many columns.
The question is how to replace each value 1 in column y with the previous value in that column.
A = [ d x y z
0 1 2 5
1 2 1 9
2 8 1 2
3 3 40 7
4 6 1 7
5 4 30 3
6 8 40 8
7 9 1 10
8 6 1 4
9 10 10 7]
The expected answer should be :
A = [ d x y z
0 1 2 5
1 2 2 9
2 8 2 2
3 3 40 7
4 6 40 7
5 4 30 3
6 8 40 8
7 9 40 10
8 6 40 4
9 10 10 7]
Many thanks in advance...
Here is my code, and I am trying to modify column y and save it in the same table1.
import csv
import sqlite3
import numpy as np
import pandas as pd

conn = sqlite3.connect('data.db')
conn.text_factory = str
cur = conn.cursor()
A = cur.execute("SELECT * FROM table1")
# newline='' keeps the csv module from writing blank rows on Windows
with open('output_data1001.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['d', 'x', 'y', 'z'])
    writer.writerows(A)

This isn't the format that your list is in. Python doesn't read lists this way. There are a couple of ways to do this but they all require either thinking about your list in a different way or formatting this as something other than a list. If you want to keep it as a list, you can make it a list of lists:
A = [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[1, 2, 8, 3, 6, 4, 8, 9, 6, 10],
[2, 1, 1, 40, 1, 30, 40, 1, 1, 10],
[5, 9, 2, 7, 7, 3, 8, 10, 4, 7]]
Now, you can reference sub-lists by their index, and make any changes you want:
# Start at 1: index 0 has no previous value (i-1 would wrap around to the last item)
for i in range(1, len(A[2])):
    if A[2][i] == 1:
        A[2][i] = A[2][i-1]
print(A[2])
>>>[2, 2, 2, 40, 40, 30, 40, 40, 40, 10]
You could also store the data in a NumPy array rather than a list:
import numpy
A = numpy.array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[1, 2, 8, 3, 6, 4, 8, 9, 6, 10],
[2, 1, 1, 40, 1, 30, 40, 1, 1, 10],
[5, 9, 2, 7, 7, 3, 8, 10, 4, 7]])
# Start at 1: index 0 has no previous value (i-1 would wrap around to the last item)
for i in range(1, len(A[2])):
    if A[2, i] == 1:
        A[2, i] = A[2, i-1]
print(A[2])
>>>[ 2  2  2 40 40 30 40 40 40 10]
Or it could be a dictionary:
A = {"d":[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
"x":[1, 2, 8, 3, 6, 4, 8, 9, 6, 10],
"y":[2, 1, 1, 40, 1, 30, 40, 1, 1, 10],
"z":[5, 9, 2, 7, 7, 3, 8, 10, 4, 7]}
# Start at 1: index 0 has no previous value (i-1 would wrap around to the last item)
for i in range(1, len(A["y"])):
    if A["y"][i] == 1:
        A["y"][i] = A["y"][i-1]
print(A["y"])
>>>[2, 2, 2, 40, 40, 30, 40, 40, 40, 10]
Python is a little looser with data structures than other languages, so it is easy to get tripped up by them in the beginning: Python will let you do a lot without being fully aware of what data type you are using. In general, though, you should always consider your data type and its syntax conventions before structuring your data that way.

Python lists are not arrays. They don't have columns. You can make lists of lists, though, which makes it possible to write things like "A[2][3]". Or, in this case, just use a dictionary where the keys are "d", "x", "y", and "z", and the respective values are lists representing each column.
>>> A = {}
>>> A['d'] = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> A['x'] = [1, 2, 8, 3, 6, 4, 8, 9, 6, 10]
>>> A['y'] = [2, 1, 1, 40, 1, 30, 40, 1, 1, 10]
>>> A['z'] = [5, 9, 2, 7, 7, 3, 8, 10, 4, 7]
>>> A['y']
[2, 1, 1, 40, 1, 30, 40, 1, 1, 10]
Then you can just write a loop to look through all of the 'y' values from the second position onward, and replace any ones with the previous value:
>>> for i in range(1,len(A['y'])):
...     if A['y'][i] == 1:
...         A['y'][i] = A['y'][i-1]
...
>>> A['y']
[2, 2, 2, 40, 40, 30, 40, 40, 40, 10]
EDIT: I see from your more recent edits that this isn't a list at all that you're working with, but what's known as a data frame from the Pandas module. Again, "lists" and "arrays" mean very specific forms in Python. But the same logic can work:
import pandas as pd
mydict = {'d': pd.Series([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
          'x': pd.Series([1, 2, 8, 3, 6, 4, 8, 9, 6, 10]),
          'y': pd.Series([2, 1, 1, 40, 1, 30, 40, 1, 1, 10]),
          'z': pd.Series([5, 9, 2, 7, 7, 3, 8, 10, 4, 7])}
A = pd.DataFrame(mydict)
# Start at row 1 so that row - 1 is always a valid label
for row in range(1, A.shape[0]):
    # .loc avoids chained assignment, which may write to a copy instead of A
    if A.loc[row, 'y'] == 1:
        A.loc[row, 'y'] = A.loc[row - 1, 'y']
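The same replacement can also be done without an explicit loop and written back to SQLite, which is what the question ultimately asks for. What follows is only a minimal sketch under a few assumptions: the table is called table1 with integer columns d, x, y, z as in the question, an in-memory database stands in for data.db so that the example runs on its own, and the first y value is never 1 (there would be no previous value to copy).

```python
import sqlite3

import pandas as pd

# Assumption: an in-memory database stands in for the question's data.db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (d INTEGER, x INTEGER, y INTEGER, z INTEGER)")
rows = [(0, 1, 2, 5), (1, 2, 1, 9), (2, 8, 1, 2), (3, 3, 40, 7), (4, 6, 1, 7),
        (5, 4, 30, 3), (6, 8, 40, 8), (7, 9, 1, 10), (8, 6, 1, 4), (9, 10, 10, 7)]
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", rows)

# Read the table in a fixed row order so "previous value" is well defined.
df = pd.read_sql_query("SELECT * FROM table1 ORDER BY d", conn)

# Hide every 1 (it becomes NaN), then fill each gap with the last value above it.
df["y"] = df["y"].mask(df["y"] == 1).ffill().astype(int)

# Overwrite table1 with the corrected rows.
df.to_sql("table1", conn, if_exists="replace", index=False)
conn.commit()

fixed_y = [y for (y,) in conn.execute("SELECT y FROM table1 ORDER BY d")]
print(fixed_y)
conn.close()
```

Note that if_exists="replace" drops and recreates the table, so any constraints or indexes defined on the original table would have to be recreated separately.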

Related

How can I create a list of 50 random numbers where the odd numbers are twice as frequent as the even numbers?

Can you help me with this problem, please?
I have to create a list of 50 random numbers from 1 to 10 where the frequency of odd numbers is approximately twice the frequency of even numbers.
Try two lists, one with 33 random odd numbers and the other with 17 random even numbers, and then merge the two lists.
A simplified solution:
import random
odd=[item for item in range(1,11,2)]
even=[item for item in range(2,11,2)]
# odd should be twice as frequent as even: 50/3 is about 17, so even=17 and odd=33
random_even=[random.choice(even) for _ in range(17)]
random_odd=[random.choice(odd) for _ in range(33)]
random_number=random_even+random_odd # sum
random.shuffle(random_number) # messing up order
print("even",len(random_even),"odd",len(random_odd)) #check len
print("len list : ",len(random_number))#check len
print(random_number)#show list
random.choices has an argument weights, which can make the odd numbers roughly twice as frequent as the even numbers:
>>> nums = range(1, 11)
>>> random.choices(nums, weights=[2 / 3, 1 / 3] * 5, k=50)
[7, 4, 4, 2, 1, 1, 7, 9, 6, 9, 3, 9, 9, 5, 9, 3, 2, 1, 9, 10, 9, 3, 7, 5, 3, 8, 8, 10, 9, 5, 2, 9, 9, 6, 5, 5, 10, 5, 6, 5, 1, 3, 3, 2, 5, 6, 5, 5, 5, 2]
>>> sum(i % 2 == 0 for i in _)
16
>>> (50 - 16) / 16
2.125
If you have strict requirements on their quantitative relationship, you can generate two lists, concatenate and shuffle:
>>> n_even = 50 // 3
>>> n_odd = 50 - n_even
>>> rand = random.choices(nums[::2], k=n_odd) + random.choices(nums[1::2], k=n_even)
>>> random.shuffle(rand)
>>> rand
[2, 10, 5, 8, 7, 10, 3, 5, 1, 7, 8, 10, 7, 5, 3, 9, 3, 10, 3, 2, 1, 10, 9, 8, 7, 6, 1, 1, 8, 1, 5, 4, 9, 9, 2, 1, 5, 1, 4, 7, 1, 2, 5, 7, 7, 1, 7, 1, 1, 5]
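The strict variant above can be wrapped in a small reusable function. This is just a sketch: the name odd_heavy_sample and the seed parameter are made up for illustration, and a private random.Random instance is used so that a fixed seed gives a repeatable list.

```python
import random


def odd_heavy_sample(k=50, seed=None):
    """Return k numbers from 1..10 with about twice as many odd as even values."""
    rng = random.Random(seed)      # private generator; seed is optional
    n_even = k // 3                # one third of the draws are even
    n_odd = k - n_even             # the remaining two thirds are odd
    odds = range(1, 11, 2)         # 1, 3, 5, 7, 9
    evens = range(2, 11, 2)        # 2, 4, 6, 8, 10
    sample = rng.choices(odds, k=n_odd) + rng.choices(evens, k=n_even)
    rng.shuffle(sample)            # mix odd and even values together
    return sample


sample = odd_heavy_sample(50, seed=0)
odd_count = sum(n % 2 for n in sample)
print(odd_count, len(sample) - odd_count)  # 34 16
```

The counts are exact by construction (34 odd and 16 even for k=50); only the individual values and their order depend on the seed.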

List that contains only the elements that are common between 2 input lists (I cannot do it without duplicates) [duplicate]

This question already has answers here:
Common elements comparison between 2 lists
(14 answers)
Closed 2 years ago.
From two lists, return a list that contains only the elements that are common between the 2 input lists. Without duplicates.
Input:
a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
My solution:
common_list = [i for i in a if i in b]
My output:
[1, 1, 2, 3, 5, 8, 13]
Output I need:
[1, 2, 3, 5, 8, 13]
You can use the set operation
In [13]: a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
...: b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
In [14]: list(set(a) & set(b))
Out[14]: [1, 2, 3, 5, 8, 13]
The problem with your code is the duplicate elements in the output. You can avoid that by applying set() to the output:
common_list = list(set(i for i in a if i in b))
You can use set intersection:
a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
out = list(set(a).intersection(set(b)))
print(out)
Output:
[1, 2, 3, 5, 8, 13]
As an alternative to Arun's answer, you can also do this:
a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
set(a).intersection(b)
Which I find more readable than set(a) & set(b) because that feels a little too "magical" to me.
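One caveat about all of the set-based answers: a set does not promise any particular order, so the result is not guaranteed to follow the order of a. If the order matters, a possible sketch (assuming Python 3.7+, where dict keeps insertion order) is to deduplicate with dict.fromkeys; the names b_set and common are chosen here for illustration.

```python
a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]

b_set = set(b)  # membership tests on a set are fast

# dict.fromkeys keeps the first occurrence of each key, in the order of a
common = list(dict.fromkeys(x for x in a if x in b_set))
print(common)  # [1, 2, 3, 5, 8, 13]
```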

Efficient way to encode a dataframe into a tree-like, nested-if structure?

Suppose I had a pandas dataframe that looked something like this:
A B C D value
1 4 6 9 100
1 4 6 10 101
1 5 7 9 100
1 5 7 11 102
1 5 8 10 105
That is, there are some identifying features whose combination uniquely identify a row, then some value. In this case there are 4 identifying features in A, B, C, D and the combination of the four values will be unique within the dataframe.
And I want to print the following nested if statement:
if A == 1
    if B == 4
        if C == 6
            if D == 9
                100
            if D == 10
                101
    if B == 5
        if C == 7
            if D == 9
                100
            if D == 11
                102
        if C == 8
            if D == 10
                105
What is an efficient (in terms of memory required to store the string) way to encode data with a variable number of identifiers A, B, ... into this format, assuming I know the identifying columns are already arranged in order of increasing cardinality?
Using "and" in a condition is also allowed, so I can also represent the tree like:
if A == 1
    if B == 4 and C == 6
        if D == 9
            100
        if D == 10
            101
    if B == 5
        if C == 7
            if D == 9
                100
            if D == 11
                102
        if C == 8 and D == 10
            105
Would be ecstatic about the latter, but a solution that achieves the former will already solve my problem!
Here is a sample df I cobbled together:
pd.DataFrame({'A': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3],
'B': [4, 4, 5, 5, 5, 4, 4, 4, 4, 5, 4, 5, 5, 5, 5],
'C': [6, 6, 7, 7, 8, 6, 7, 7, 7, 8, 6, 7, 8, 8, 8],
'D': [9, 10, 9, 11, 10, 12, 12, 13, 15, 10, 9, 10, 9, 16, 17],
'value': [100, 101, 100, 102, 105, 103, 103, 100, 101, 107, 102, 100, 111, 105, 109]})
Okay, so this isn't exactly what you asked for, but I think it's worth considering.
Maybe you could train a decision tree classifier. You could then export the tree and write some custom code to convert it to if-statements, which should be straightforward. Here is some code I was tinkering with:
import pandas as pd
from sklearn import tree
from graphviz import Source
import numpy as np
data = pd.DataFrame({'A': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3],
'B': [4, 4, 5, 5, 5, 4, 4, 4, 4, 5, 4, 5, 5, 5, 5],
'C': [6, 6, 7, 7, 8, 6, 7, 7, 7, 8, 6, 7, 8, 8, 8],
'D': [9, 10, 9, 11, 10, 12, 12, 13, 15, 10, 9, 10, 9, 16, 17],
'value': [100, 101, 100, 102, 105, 103, 103, 100, 101, 107, 102, 100, 111, 105, 109]})
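X = data[['A', 'B', 'C', 'D']]
Y = data['value']
clf = tree.DecisionTreeClassifier()
clf = clf.fit(X, Y)
# Display the tree. class_names must follow the sorted order of clf.classes_,
# and np.str no longer exists in current NumPy, so the built-in str is used.
classes = [str(c) for c in clf.classes_]
Source(tree.export_graphviz(clf, out_file=None, feature_names=list(X.columns), class_names=classes))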
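For comparison, the first target format from the question can also be produced directly from the dataframe, without training a model. The following is a minimal sketch using a recursive groupby; the function name nested_ifs and its parameters are hypothetical, and it does not attempt the more compact "and" form.

```python
import pandas as pd


def nested_ifs(df, id_cols, value_col="value", depth=0):
    """Return the lines of an indented nested-if listing for df."""
    pad = "    " * depth
    if not id_cols:
        # All identifiers consumed: emit the remaining value(s) as leaves.
        return [f"{pad}{v}" for v in df[value_col]]
    first, rest = id_cols[0], id_cols[1:]
    lines = []
    # sort=False keeps the groups in order of first appearance.
    for key, group in df.groupby(first, sort=False):
        lines.append(f"{pad}if {first} == {key}")
        lines.extend(nested_ifs(group, rest, value_col, depth + 1))
    return lines


# The five-row example from the question.
data = pd.DataFrame({"A": [1, 1, 1, 1, 1],
                     "B": [4, 4, 5, 5, 5],
                     "C": [6, 6, 7, 7, 8],
                     "D": [9, 10, 9, 11, 10],
                     "value": [100, 101, 100, 102, 105]})
lines = nested_ifs(data, ["A", "B", "C", "D"])
print("\n".join(lines))
```

Each shared prefix is printed only once, so the output grows with the number of distinct prefixes rather than with rows times columns.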

Combining data contained in several lists

I am working on a personal project in python 3.6. I used pandas to import the data from an excel file in a dataframe and then I extracted data into several lists.
Now, I will give an example to illustrate exactly what I am trying to achieve.
So let's say I have 3 input lists a, b and c (I added the index row and some extra whitespace inside the lists so that it is easier to follow):
0 1 2 3 4 5 6
a=[1, 5, 6, [10,12,13], 1, [5,3] ,7]
b=[3, [1,2], 3, [5,6], [1,3], [5,6], 9]
c=[1, 0 , 4, [1,2], 2 , 8 , 9]
I am trying to combine the data in order to get all the combinations when in one of the lists there is a list containing multiple elements. So the output needs to be like this:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
a=[1, 5, 5, 6, 10,10,10, 10, 12, 12, 12, 12, 13, 13, 13, 13, 1, 1, 5, 5, 3, 3, 7]
b=[3, 1, 2, 3, 5, 5, 6, 6, 5, 5, 6, 6, 5, 5, 6, 6, 1, 3, 5, 6, 5, 6, 9]
c=[1, 0, 0, 4, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 2, 2, 8, 8, 8, 8, 9]
To make this more clear:
From the original lists if we look at index 1 elements:
a[1]=5, b[1]=[1,2], c[1]=0. These got transformed to the following values on the 1 and 2 index positions: a[1:3]=[ 5, 5 ]; b[1:3]=[1, 2]; c[1:3]=[ 0, 0]
This needs to be applied also to index 3, 4, and 5 in the original input lists in order to obtain something similar to the example output above.
I want to be able to generalize this to more lists (a,b,c.....n). I have been able to do this for two lists, but in a totally not elegant, definitely not pythonic way. Also I think the code I wrote can't be generalized to more lists.
I am looking for some help, at least some pointers to some reading material that can help me achieve what I presented above.
Thank you!
You could do something like this.
Looks at each column, works out the combinations, then output the list:
import pandas as pd
import numpy
a=[1, 5, 6, [10,12,13], 1, [5,3] ,7]
b=[3, [1,2], 3, [5,6], [1,3], [5,6], 9]
c=[1, 0 , 4, [1,2], 2 , 8 , 9]
df = pd.DataFrame([a,b,c])
final_df = pd.DataFrame()
i=0
for col in df.columns:
    temp_df = pd.DataFrame(df[col])
    get_combo = []
    for idx, row in temp_df.iterrows():
        get_combo.append([row[i]])
    combo_list = [list(x) for x in numpy.array(numpy.meshgrid(*get_combo)).T.reshape(-1,len(get_combo))]
    temp_df_alpha = pd.DataFrame(combo_list).T
    i+=1
    if len(final_df) == 0:
        final_df = temp_df_alpha
    else:
        final_df = pd.concat([final_df, temp_df_alpha], axis=1, sort=False)
for idx, row in final_df.iterrows():
    print (row.tolist())
Output:
[1, 5, 5, 6, 10, 10, 12, 12, 13, 13, 10, 10, 12, 12, 13, 13, 1, 1, 5, 5, 3, 3, 7]
[3, 1, 2, 3, 5, 6, 5, 6, 5, 6, 5, 6, 5, 6, 5, 6, 1, 3, 5, 6, 5, 6, 9]
[1, 0, 0, 4, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 8, 8, 8, 8, 9]
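The same expansion can also be written with only the standard library, which generalizes to any number of input lists. This is a sketch rather than the answer's method: the function name expand is made up, and it assumes that every multi-valued entry is a plain Python list. itertools.product varies the last list fastest, which happens to give the ordering shown in the question's expected output.

```python
from itertools import product


def expand(*lists):
    """Expand list-valued entries into all combinations, position by position."""
    out = [[] for _ in lists]
    for items in zip(*lists):
        # Wrap scalars so that every entry is a list of options.
        options = [x if isinstance(x, list) else [x] for x in items]
        for combo in product(*options):
            for column, value in zip(out, combo):
                column.append(value)
    return out


a = [1, 5, 6, [10, 12, 13], 1, [5, 3], 7]
b = [3, [1, 2], 3, [5, 6], [1, 3], [5, 6], 9]
c = [1, 0, 4, [1, 2], 2, 8, 9]

new_a, new_b, new_c = expand(a, b, c)
print(new_a)
print(new_b)
print(new_c)
```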

List comprehension logic not working and I'm not sure why [duplicate]

This question already has answers here:
Common elements between two lists with no duplicates
(7 answers)
Closed 3 years ago.
The exercise I'm doing requires me to create and print out a list containing all of the common elements in the 2 following lists without duplicates:
a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
I'm trying to create the new list in one line of code and I think my logic is correct but obviously there's an issue with it somewhere.
Here's what is currently not working:
a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
common_list = []
common_list = [nums for nums in a if (nums in b and nums not in common_list)]
print(common_list)
I expect to get [1, 2, 3, 5, 8, 13] but the 1 is still duplicated even though I have the 'nums not in common_list' condition so I end up getting
[1, 1, 2, 3, 5, 8, 13]
As already mentioned in other answers and comments, your problem is that, during the list comprehension, common_list is still the empty list: the name is only rebound to the new list after the comprehension has finished.
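A small experiment makes this visible. The helper not_yet_added below is purely illustrative (it is not part of the question's code): it records the length of common_list every time the filter runs.

```python
a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]

common_list = []
lengths_seen = []


def not_yet_added(n):
    # Hypothetical helper: note how long common_list is while filtering.
    lengths_seen.append(len(common_list))
    return n not in common_list


# The comprehension builds a new, anonymous list; common_list is only
# rebound to it once the whole comprehension has been evaluated.
common_list = [n for n in a if n in b and not_yet_added(n)]

print(set(lengths_seen))  # {0}
print(common_list)        # [1, 1, 2, 3, 5, 8, 13]
```

Every recorded length is 0, so the "not in common_list" test can never reject anything.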
Now for practical solutions: if order is not important, sets are your friends:
common_list = list(set(a) & set(b))
and if order is important, sets are still your friends:
seen = set()
bset = set(b) # makes `in` test much faster
common_list = []
for item in a:
    if item in seen:
        continue
    if item in bset:
        common_list.append(item)
        seen.add(item)
Instead of using a list, I suggest you use a set to avoid duplicate values.
common_set = set()
You can add items by:
common_set.add(value)
Finally you can print values by:
print(common_set)
You can use enumerate:
a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
res = [i for n, i in enumerate(a) if i not in a[:n] and i in b]
print (res)
output:
[1, 2, 3, 5, 8, 13]
One way to do this with lists (assuming that the list you iterate over has no duplicates, which is b here):
a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
c = [x for x in b if x in a]
print(c)
# [1, 2, 3, 5, 8, 13]
or, for any list:
a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
c = []
for x in a + b:
    if x in a and x in b and x not in c:
        c.append(x)
print(c)
# [1, 2, 3, 5, 8, 13]
But sets are much better suited for this:
a = {1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89}
b = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13}
c = a.intersection(b)
print(c)
# {1, 2, 3, 5, 8, 13}
One-liner:
list(set(a).intersection(b))
