For loop cycle order - python

I am creating a short script that tweets automatically via the Twitter API. Besides setting up the API credentials (outside the scope of this question), I import the following library:
import os
I have set my working directory to be a folder where I have 3 photos. If I run os.listdir('.') I get the following list.
['Image_1.PNG',
'Image_2.PNG',
'Image_3.jpg',]
"mylist" is a list of strings, practically 3 tweets.
The code that posts to Twitter automatically looks like this:
for image in os.listdir('.'):
    for num in range(len(mylist)):
        api.update_with_media(image, mylist[num])
The code takes the first image and posts it with the first tweet, then posts the same image with the second tweet, and then again with the third. It then repeats this for the second and third images, for a total of 3*3 = 9 posts.
What I want instead is to post the first image with the first tweet, then the second image with the second tweet, then the third image with the third tweet. After that I want to run the cycle once more: 1st image with 1st tweet, 2nd image with 2nd tweet, and so on.

Use zip to iterate through two (or more) collections in parallel:
for tweet, image in zip(mylist, os.listdir('.')):
    api.update_with_media(image, tweet)
To repeat the whole sequence, put this loop inside another for loop.
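As a minimal sketch of that idea, the snippet below pairs images and tweets with zip and repeats the pairing a fixed number of times. The helper name paired_posts, the rounds parameter and the sample tweet texts are made up for illustration, and the print call stands in for api.update_with_media. The file names are sorted because os.listdir returns entries in arbitrary order.

```python
def paired_posts(images, tweets, rounds=2):
    """Yield (image, tweet) pairs in parallel order, repeated `rounds` times."""
    for _ in range(rounds):
        # zip stops at the shorter of the two sequences
        for image, tweet in zip(images, tweets):
            yield image, tweet


# Hypothetical data; in the real script these would come from
# sorted(os.listdir('.')) and mylist.
images = sorted(['Image_1.PNG', 'Image_2.PNG', 'Image_3.jpg'])
tweets = ['first tweet', 'second tweet', 'third tweet']

posts = list(paired_posts(images, tweets))
for image, tweet in posts:
    # Stand-in for api.update_with_media(image, tweet)
    print(image, '->', tweet)
```

With three images and three tweets this yields six posts: the three pairs in order, then the same three pairs again.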

Assuming os.listdir('.') and mylist have the same length:
imageList = os.listdir('.')
# Use the shorter of the two lists so neither index runs out of range
length = min(len(mylist), len(imageList))
iterations = 2  # The number of times you want this to run
for i in range(iterations):
    for x in range(length):
        api.update_with_media(imageList[x], mylist[x])

Related

Downloading tweets with Tweepy

I have a script that downloads a number of tweets using Tweepy's Cursor. The issue is that when I specify the number of tweets to download, Tweepy seems to return far more tweets than requested, about 90 percent of them duplicates. Below is my exact code snippet.
qw = ['Pele']
tweet_dataset = pd.DataFrame(columns=['Tweet_id','Author'])
for tweet in tw.Cursor(api.search_tweets,tweet_mode='extended', q=qw).items(5):
    appending_dataframe = pd.DataFrame([[tweet.id,tweet.author.screen_name]],
                                       columns=['Tweet_id','Author'])
    tweet_dataset = tweet_dataset.append(appending_dataframe)
    print(tweet_dataset[['Author','Tweet_id']].head())
From the above script I only want 5 tweets. Instead it loops: the first time it shows 1 tweet, the second time 2 tweets, and so on, until the fifth time, when it shows 5 tweets. Please see the snippet of the results below:
(https://i.stack.imgur.com/Dnm7y.png)
I only want, say, 5 tweets from Cursor, not 5 groups of tweets as Cursor seems to return them.
The head method returns the first 5 rows by default.
Because the print call runs at every iteration, each iteration prints the first 5 rows of the DataFrame built so far: 1 row in the first iteration (there is only one), 2 rows in the second, and so on.
.head(1) would instead return a single row (the first one) on every iteration; to print the 5 tweets only once, move the print call after the loop.
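The effect can be reproduced without Tweepy or pandas. The sketch below uses a plain list of made-up (id, author) tuples, and first_rows is a hypothetical stand-in for DataFrame.head, so it only illustrates the control flow, not the real API. It records how many rows a print inside the loop would show at each step, then prints once after the loop.

```python
def first_rows(rows, n=5):
    """Plain-Python stand-in for DataFrame.head(n): the first n rows."""
    return rows[:n]


# Made-up data standing in for the five tweets returned by the cursor
fake_tweets = [(101, 'alice'), (102, 'bob'), (103, 'carol'),
               (104, 'dave'), (105, 'erin')]

rows = []
shown_inside_loop = []
for tweet_id, author in fake_tweets:
    rows.append((tweet_id, author))
    # A print placed here would show this many rows at each iteration
    shown_inside_loop.append(len(first_rows(rows)))

print(shown_inside_loop)   # grows by one row per iteration
print(first_rows(rows))    # printing once, after the loop, shows the 5 rows
```

With pandas, the same idea would be to collect the rows in a list and build the DataFrame once after the loop with pd.DataFrame(rows, columns=['Tweet_id', 'Author']). That also avoids DataFrame.append, which was removed in pandas 2.0.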

Select and filter files in a directory by enumerating them in a loop

I have a folder containing many files with the .EOF extension. I want to sort them by name with Python (as you can see in my example, every file name contains a date range such as 20190729_20190731; they are just satellite orbital information files), then select the 1st, 24th, 47th, ... files (by index) and delete the others, because I need the information file for every 24th day (for example V20190822T225942_20190824T005942), not for every day. For convenience, I start selecting from the first day I need: the first file is kept, then the file 24 days after the first, then the file 47 days after the first (24 days after the second), and so on. I need to keep exactly these files and delete all the others in my EOF source folder. The files I want to keep look like this:
S1A_OPER_AUX_POEORB_OPOD_20190819T120911_V20190729T225942_20190731T005942.EOF
S1A_OPER_AUX_POEORB_OPOD_20190912T120638_V20190822T225942_20190824T005942.EOF
.
.
.
Mr Zach Young wrote the code below and I appreciate it very much; I never thought somebody would help me. I think I'm very close to the goal.
The error is reported at print(f'Keeping {eof_file}'). I changed the syntax but got the same error at print(f"Keeping {eof_file}").
from importlib.metadata import files
import pprint
items = os.listdir("C:/Users/m/Desktop/EOF")
eof_files = []
for item in items:
    # make sure case of item and '.eof' match
    if item.lower().endswith('.eof'):
        eof_files.append(item)
eof_files.sort(key=lambda fname : fname.split('_')[5])
print('All EOF files, sorted')
pprint.pprint(eof_files)
print('\nKeeping:')
files_to_delete = []
count = 0
offset = 2
for eof_file in eof_files:
    if count == offset:
        print(f"Keeping: [eof_file]")
        # reset count
        count = 0
        continue
    files_to_delete.append(eof_file)
    count += 1
print('\nRemoving:')
for f_delete in files_to_delete:
    print(f'Removing: [f_delete]')
staticmethod
Here's a top-to-bottom demonstration.
I recommend that you:
Run that script as-is and make sure your print statements match mine
Swap in your items = os.listdir(...), and see that your files are properly sorted
Play with the offset variable and make sure you can control what should be kept and what should be deleted; notice that an offset of 2 keeps every third file because count starts at 0
You might need to play around and experiment to make sure you're happy before moving to the final step:
Finally, swap in your os.remove(f_delete)
#!/usr/bin/env python3
import pprint

items = [
    'foo_bar_baz_bak_bam_20190819T120907_V2..._SomeOtherDate.EOF',
    'foo_bar_baz_bak_bam_20190819T120901_V2..._SomeOtherDate.EOF',
    'foo_bar_baz_bak_bam_20190819T120905_V2..._SomeOtherDate.EOF',
    'foo_bar_baz_bak_bam_20190819T120902_V2..._SomeOtherDate.EOF',
    'foo_bar_baz_bak_bam_20190819T120903_V2..._SomeOtherDate.EOF',
    'foo_bar_baz_bak_bam_20190819T120904_V2..._SomeOtherDate.EOF',
    'foo_bar_baz_bak_bam_20190819T120906_V2..._SomeOtherDate.EOF',
    'bogus.txt'
]
eof_files = []
for item in items:
    # lower-case the name so '.EOF' and '.eof' both match
    if item.lower().endswith('.eof'):
        eof_files.append(item)
# sort by the sixth underscore-separated field (the first timestamp)
eof_files.sort(key=lambda fname: fname.split('_')[5])
print('All EOF files, sorted')
pprint.pprint(eof_files)
print('\nKeeping:')
files_to_delete = []
count = 0
offset = 2
for eof_file in eof_files:
    if count == offset:
        print(f'Keeping {eof_file}')
        # reset count
        count = 0
        continue
    files_to_delete.append(eof_file)
    count += 1
print('\nRemoving:')
for f_delete in files_to_delete:
    print(f'Removing {f_delete}')
When I run that, I get:
All EOF files, sorted
['foo_bar_baz_bak_bam_20190819T120901_V2..._SomeOtherDate.EOF',
'foo_bar_baz_bak_bam_20190819T120902_V2..._SomeOtherDate.EOF',
'foo_bar_baz_bak_bam_20190819T120903_V2..._SomeOtherDate.EOF',
'foo_bar_baz_bak_bam_20190819T120904_V2..._SomeOtherDate.EOF',
'foo_bar_baz_bak_bam_20190819T120905_V2..._SomeOtherDate.EOF',
'foo_bar_baz_bak_bam_20190819T120906_V2..._SomeOtherDate.EOF',
'foo_bar_baz_bak_bam_20190819T120907_V2..._SomeOtherDate.EOF']
Keeping:
Keeping foo_bar_baz_bak_bam_20190819T120903_V2..._SomeOtherDate.EOF
Keeping foo_bar_baz_bak_bam_20190819T120906_V2..._SomeOtherDate.EOF
Removing:
Removing foo_bar_baz_bak_bam_20190819T120901_V2..._SomeOtherDate.EOF
Removing foo_bar_baz_bak_bam_20190819T120902_V2..._SomeOtherDate.EOF
Removing foo_bar_baz_bak_bam_20190819T120904_V2..._SomeOtherDate.EOF
Removing foo_bar_baz_bak_bam_20190819T120905_V2..._SomeOtherDate.EOF
Removing foo_bar_baz_bak_bam_20190819T120907_V2..._SomeOtherDate.EOF
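For the final step, here is a hedged sketch of how the keep/delete split and the actual removal might be written. The helper names split_keep_delete and remove_files, the dry_run flag and the sample file names are assumptions made for this illustration. The slice sorted_files[offset::offset + 1] selects the same files as the counting loop above (with offset = 2, indexes 2, 5, 8, ...).

```python
import os


def split_keep_delete(sorted_files, offset):
    """Split a sorted list into (keep, delete) like the counting loop does."""
    keep = sorted_files[offset::offset + 1]
    kept = set(keep)
    delete = [name for name in sorted_files if name not in kept]
    return keep, delete


def remove_files(folder, names, dry_run=True):
    """Delete the named files from folder; with dry_run=True only report."""
    for name in names:
        path = os.path.join(folder, name)
        if dry_run:
            print(f'Would remove {path}')
        else:
            os.remove(path)


# Hypothetical, already-sorted file names
files = [f'file_{i}.EOF' for i in range(1, 8)]
keep, delete = split_keep_delete(files, offset=2)
print(keep)
remove_files('some_folder', delete)  # dry run: nothing is deleted
```

Leaving dry_run=True until the printed list looks right is a cheap safeguard, since os.remove cannot be undone.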

Copy specified number of lines to multiple documents with a Python script that can be run from the command line

I am trying to build a script that copies a specified number of lines from one document to multiple other documents. The copied lines are supposed to be appended to the end of the docs. In case I want to delete lines from the end of the docs, the script also has to be able to delete a specified number of lines.
I want to be able to run the script from the command line and want to pass two args:
"add" or "del"
number of lines (counting from the end of the document)
A command could look like this:
py doccopy.py add 2 which would copy the last 2 lines to the other docs, or:
py doccopy.py del 4 which would delete the last 4 lines from all docs.
So far, I have written a function that copies the number of lines I want from the original document,
def copy_last_lines(number_of_lines):
    line_offset = [0]
    offset = 0
    for line in file_to_copy_from:
        line_offset.append(offset)
        offset += len(line)
    file_to_copy_from.seek(line_offset[number_of_lines])
    changedlines = file_to_copy_from.read()
a function that pastes said lines to a document
def add_to_file():
    doc = open(files_to_write[file_number], "a")
    doc.write("\n")
    doc.write(changedlines.strip())
    doc.close()
and a main function:
def main(action, number_of_lines):
    if action == "add":
        for files in files_to_write:
            add_to_file()
    elif action == "del":
        for files in files_to_write:
            del_from_file()
    else:
        print("Not a valid action.")
The main function isn't done yet, of course and I have yet to figure out how to realize the del_from_file function.
I also have problems with looping through all the documents.
My idea was to make a list containing the paths to all the documents I want to write to, loop through this list, and keep a single variable for the "original" document, but I don't know if that's even possible the way I want to do it.
If possible, maybe someone has an idea for how to realize all this with a single list, have the "original" document be the first entry and loop through the list starting with "1" when writing to the other docs.
I realize that the code I've written so far is a total mess and that I'm asking a lot of questions, so I'd be grateful for every bit of help. I'm totally new to programming; I just did a Python crash course over the last 3 days, and my first project of my own is turning out to be way more complicated than I thought it would be.
This should do what you ask, I think.
# ./doccopy.py add src N dst...
#     Appends the last N lines of src to all of the dst files.
# ./doccopy.py del N dst...
#     Removes the last N lines from all of the dst files.
import sys

def process_add(args):
    # Fetch the last N lines of src.
    src = args[0]
    count = int(args[1])
    with open(src) as f:
        lines = f.readlines()[-count:]
    # Append them to every dst file.
    for dst in args[2:]:
        with open(dst, 'a') as f:
            f.write(''.join(lines))

def process_del(args):
    # Delete the last N lines of each dst file.
    count = int(args[0])
    for dst in args[1:]:
        with open(dst) as f:
            lines = f.readlines()[:-count]
        with open(dst, 'w') as f:
            f.write(''.join(lines))

def main():
    if sys.argv[1] == 'add':
        process_add(sys.argv[2:])
    elif sys.argv[1] == 'del':
        process_del(sys.argv[2:])
    else:
        print("What?")

if __name__ == "__main__":
    main()
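One detail worth checking before relying on the slicing approach: with a count of 0, lines[-0:] is the whole list and lines[:-0] is empty, so "add 0" would copy everything and "del 0" would empty the file. The sketch below is one possible way to guard against that, not the answerer's code. The function names append_last_lines and delete_last_lines and the temporary demo files are assumptions made for this illustration.

```python
import os
import tempfile


def append_last_lines(src, dsts, count):
    """Append the last `count` lines of src to every file in dsts."""
    with open(src) as f:
        lines = f.readlines()
    tail = lines[-count:] if count > 0 else []  # lines[-0:] would be everything
    for dst in dsts:
        with open(dst, 'a') as f:
            f.writelines(tail)


def delete_last_lines(dsts, count):
    """Remove the last `count` lines from every file in dsts."""
    for dst in dsts:
        with open(dst) as f:
            lines = f.readlines()
        kept = lines[:-count] if count > 0 else lines  # lines[:-0] would be empty
        with open(dst, 'w') as f:
            f.writelines(kept)


# Small demo on throwaway files
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'src.txt')
    dst = os.path.join(tmp, 'dst.txt')
    with open(src, 'w') as f:
        f.write('a\nb\nc\n')
    with open(dst, 'w') as f:
        f.write('x\ny\n')

    append_last_lines(src, [dst], 2)
    with open(dst) as f:
        after_add = f.read()

    delete_last_lines([dst], 3)
    with open(dst) as f:
        after_del = f.read()

print(repr(after_add))
print(repr(after_del))
```

Both functions read the whole file into memory, which is fine for ordinary documents but would need rethinking for very large files.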

List Index Out of Range on Python. Nothing works

I have already reviewed multiple threads with similar answers to my question. Nothing seems to be working no matter what I try.
I am trying to create 100 random numbers, and put those random numbers into a list. However I keep getting
File "E:\WorkingWithFiles\funstuff.py", line 17, in randNumbs
numbList[index]+=1
IndexError: list index out of range
My code is:
def randNumbs(numbCount):
    numbList=[0]*100
    i=1
    while i < 100:
        index = random.randint(1,100)
        numbList[index]+=1
        i+=1
    print (numbList)
    return (numbList)
After reviewing multiple threads and tinkering around I cannot seem to get an answer.
Before I continue here is the scope of the project:
I have a .txt file that's a dictionary containing some number of words. First, I write a function to count how many words are in the .txt file. Second, I generate 100 random numbers between 1 and the number of words in the .txt file. Lastly, I need to create a .txt file that prints
"Number Word"
120 Bologna
and so on. I am having trouble generating the random numbers. If anybody has any idea on why my list index is out of range and how to help, all help would be appreciated! Thank you!
Edit: the .txt file is 113k words long
You made a list of size 100 here:
numbList=[0]*100
Your problem is that you generate indexes from 1 to 100 (random.randint includes both endpoints) when you should be accessing indexes 0 to 99. Given a list of size n, the valid list indexes are 0 to n-1.
Change your code to
index = random.randint(0,99)
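An alternative that avoids hard-coding 99 is random.randrange, which excludes its upper bound just like range does. The sketch below is only an illustration; the function name count_random_indices and its parameters are made up here.

```python
import random


def count_random_indices(size, draws):
    """Draw `draws` random indexes into a list of length `size` and count hits."""
    counts = [0] * size
    for _ in range(draws):
        # randrange(size) returns 0 <= index < size, so it is always a valid index
        index = random.randrange(size)
        counts[index] += 1
    return counts


counts = count_random_indices(100, 1000)
print(len(counts), sum(counts))
```

Because the bound comes from the list size itself, changing the size later cannot reintroduce the off-by-one error.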
Looks like an off-by-one error. randint will return numbers 1 to 100, while your list has indexes 0 to 99.
Also, you can rewrite your code like this:
def randNumbs(numbCount):
    return [random.randint(1, 100) for i in range(numbCount)]
I would approach the problem a little differently:
from random import sample

SAMPLE_SIZE = 100

# load words
with open("dictionary.txt") as inf:
    words = inf.read().splitlines()    # assumes one word per line

# pick word indices
# Note: this returns only unique indices,
#   ie a given word will not be returned twice
num_words = len(words)
which_words = sample(range(num_words), SAMPLE_SIZE)

# Note: if you did not need the word indices, you could just call
#   which_words = sample(words, SAMPLE_SIZE)
# and get back a list of 100 words directly

# if you want words in sorted order
which_words.sort()

# display selected words
print("Number Word")
for w in which_words:
    print("{:6d} {}".format(w, words[w]))
which gives something like
Number Word
   198 abjuring
  2072 agitates
  2564 alevin
  6345 atrophies
  8108 barrage
  9155 begloom
 10237 biffy
 11078 bleedings
 11970 booed
 14131 burials
 14531 cabal
# etc...
Here, I’ve tried to fix your code. Explanations in comments.
import random

def rand_numbs(numb_count):
    # this will generate a list of length 100
    # it will have indexes from 0 to 99
    numbList = [0] * 100
    # don't use a while loop...
    # when a for loop will do
    for _ in range(numb_count):
        # randint(i, j) will generate a number
        # between i and j, both inclusive!
        # which means that both i and j can be generated
        index = random.randint(0, 99)
        # remember that python lists are 0-indexed
        # the first element is numbList[0]
        # and the last element is numbList[99]
        numbList[index] += 1
    print(numbList)
    return numbList

How to choose a random but non-recent asset from a list?

I have the issue of trying to pull a "random" item out of a database in my Flask app. This function only needs to return a video that wasn't recently watched by the user. I am not worried about multiple users right now. My current way of doing this does not work. This is what I am using:
@app.route('/_new_video')
def new_video():
Here's the important part I'm asking about:
    current_id = request.args.get('current_id')
    video_id = random.choice(models.Video.query.all()) # return list of all video ids
    while True:
        if video_id != current_id:
            new_video = models.Video.query.get(video_id)
and then I return it:
    webm = new_video.get_webm() #returns filepath in a string
    mp4 = new_video.get_mp4() #returns filepath in a string
    return jsonify(webm=webm,mp4=mp4,video_id=video_id)
The random range starts at 1 because the first asset was deleted from the database, so the number 0 isn't associated with an id. Ideally, the user would not get a video they had recently watched.
I recommend using a collections.deque to store a recently watched list. It holds a list-like collection of items, and once it reaches its maximum length, each new item you add automatically pushes out the oldest one, on a first-in, first-out basis.
import collections
import random
And here's a generator that you can use to get random vids, that haven't been recently watched. The denom argument will allow you to change the length of the recently watched list because it's used to determine the max length of your recently_watched as a fraction of your list of vids.
def gen_random_vid(vids, denom=2):
    '''return a random vid id that hasn't been recently watched'''
    recently_watched = collections.deque(maxlen=len(vids)//denom)
    while True:
        selection = random.choice(vids)
        if selection not in recently_watched:
            yield selection
            recently_watched.append(selection)
I'll create a quick list to demo it:
vids = ['vid_' + c for c in 'abcdefghijkl']
And here's the usage:
>>> recently_watched_generator = gen_random_vid(vids)
>>> next(recently_watched_generator)
'vid_e'
>>> next(recently_watched_generator)
'vid_f'
>>> for _ in range(10):
...     print(next(recently_watched_generator))
...
vid_g
vid_d
vid_c
vid_f
vid_e
vid_g
vid_a
vid_f
vid_e
vid_c
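A generator keeps its state only while the same generator object is alive, which may not fit a web view that is called once per request. As a rough sketch under that assumption, the same deque idea can be written as a plain function that receives the recently watched deque from the caller. The name pick_video and the integer ids are made up for illustration, and where the deque is stored between requests (session, database, module-level variable) is left open.

```python
import collections
import random


def pick_video(video_ids, recently_watched):
    """Return a random id that is not in recently_watched, then record it."""
    candidates = [v for v in video_ids if v not in recently_watched]
    if not candidates:
        # Every id counts as recent: fall back to the full list
        candidates = list(video_ids)
    choice = random.choice(candidates)
    recently_watched.append(choice)  # the deque drops its oldest entry when full
    return choice


# Hypothetical ids; in the app they would come from the database
video_ids = list(range(1, 13))
recently_watched = collections.deque(maxlen=len(video_ids) // 2)

picks = [pick_video(video_ids, recently_watched) for _ in range(30)]
print(picks)
```

Filtering the candidates first means a single random.choice call always succeeds, instead of retrying until a non-recent id happens to come up.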
