Python function similar to the bash find command - python

I have a dir structure like the following:
[me#mypc]$ tree .
.
├── set01
│   ├── 01
│   │   ├── p1-001a.png
│   │   ├── p1-001b.png
│   │   ├── p1-001c.png
│   │   ├── p1-001d.png
│   │   └── p1-001e.png
│   ├── 02
│   │   ├── p2-001a.png
│   │   ├── p2-001b.png
│   │   ├── p2-001c.png
│   │   ├── p2-001d.png
│   │   └── p2-001e.png
I would like to write a Python script to rename all *a.png to 01.png, *b.png to 02.png, and so on. First, I guess I have to use something similar to find . -name '*.png', and the closest thing I found in Python was os.walk. However, with os.walk I have to check every file and, if it is a PNG, concatenate it with its root, which is not that elegant. Is there a better way to do this? Thanks in advance.

For a search pattern like that, you can probably get away with glob.
from glob import glob
paths = glob('set01/*/*.png')
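If the depth of the tree is not fixed, glob can also descend recursively. Here is a minimal sketch, assuming Python 3.5+ (where glob accepts recursive=True). The helper name find_pngs is made up for illustration:

```python
import glob
import os


def find_pngs(top):
    """Return every .png path below top, roughly like: find top -name '*.png'."""
    # '**' matches any number of nested directories when recursive=True.
    # glob.escape protects special characters that may occur in top itself.
    pattern = os.path.join(glob.escape(top), "**", "*.png")
    return sorted(glob.glob(pattern, recursive=True))
```

Calling find_pngs('.') would then return the PNG paths at any depth, already joined with their directories.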

You can use os.walk to traverse the directory tree.
Maybe this works?
import os
for dpath, dnames, fnames in os.walk("."):
    # Number only the PNG files so other files do not consume indices.
    pngs = sorted(f for f in fnames if f.endswith(".png"))
    for i, fname in enumerate(pngs):
        src = os.path.join(dpath, fname)
        dst = os.path.join(dpath, "%04d.png" % i)
        # os.rename(src, dst)
        print("mv %s %s" % (src, dst))
For Python 3.4+, you may want to use pathlib's Path.glob() with a recursive pattern (e.g., **/*.png), or Path.rglob(), instead:
Recursively iterate through all subdirectories using pathlib
https://docs.python.org/3/library/pathlib.html#pathlib.Path.glob
https://docs.python.org/3/library/pathlib.html#pathlib.Path.rglob
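As a rough sketch of how Path.rglob could drive the rename asked about in the question: the function names plan_renames and rename_all are hypothetical, and the code assumes each directory holds at most one file per trailing letter (otherwise targets would collide).

```python
import string
from pathlib import Path


def plan_renames(top):
    """Yield (source, target) pairs such as p1-001a.png -> 01.png."""
    # sorted() materializes the listing first, so renaming during
    # iteration does not disturb the traversal.
    for src in sorted(Path(top).rglob("*.png")):
        letter = src.stem[-1:].lower()
        if len(letter) != 1 or letter not in string.ascii_lowercase:
            continue  # stem does not end in a letter; leave the file alone
        number = string.ascii_lowercase.index(letter) + 1  # a -> 1, b -> 2, ...
        yield src, src.with_name("%02d.png" % number)


def rename_all(top, dry_run=True):
    """Print the planned moves, or perform them when dry_run is False."""
    for src, dst in plan_renames(top):
        if dry_run:
            print("mv %s %s" % (src, dst))
        else:
            src.rename(dst)
```

Running rename_all('.') first only prints the planned moves. Passing dry_run=False performs them.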

Check out genfind.py from David M. Beazley.
# genfind.py
#
# A function that generates files that match a given filename pattern
import os
import fnmatch
def gen_find(filepat,top):
    for path, dirlist, filelist in os.walk(top):
        for name in fnmatch.filter(filelist,filepat):
            yield os.path.join(path,name)
# Example use
if __name__ == '__main__':
    lognames = gen_find("access-log*","www")
    for name in lognames:
        print(name)

These days, pathlib is a convenient option.

Related

Get directories from the current to the n-th depth

Suppose a directory structure as:
├── parent_1
│   ├── child_1
│   │   ├── sub_child_1
│   │   │   └── file_1.py
│   │   └── file_2.py
│   └── file_3.py
├── parent_2
│   └── child_2
│       └── file_4.py
└── file_5.py
I want to get two arrays:
parents = ["parent_1", "parent_2"]
children = ["child_1", "child_2"]
Note that files and sub_child_1 are not included.
Using suggestions such as this, I can write:
parents = []
children = []
for root, dir, files in os.walk(path, topdown=True):
    depth = root[len(path) + len(os.path.sep):].count(os.path.sep)
    if depth == 0:
        parents.append(dir)
    elif depth == 1:
        children.append(dir)
However, this is a bit wordy and I was wondering if there is a cleaner way of doing this.
Update 1
I also tried a listdir-based approach:
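from os import listdir
from os.path import isdir, join
parents = [f for f in listdir(root) if isdir(join(root, f))]
children = []
for p in parents:
    # List the parent relative to root so this works from any working directory.
    children.append([f for f in listdir(join(root, p)) if isdir(join(root, p, f))])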
You can clear the directories returned by os.walk to prevent it from traversing deeper when it has reached your desired depth:
for root, dirs, _ in os.walk(path, topdown=True):
    if root == path:
        continue
    parents.append(root)
    children.extend(dirs)
    dirs.clear()
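For completeness, here is a self-contained sketch of the same pruning idea that collects bare directory names rather than full paths. The function name dirs_to_depth_two is made up for this example, and it assumes path is passed exactly as os.walk will report it for the top level:

```python
import os


def dirs_to_depth_two(path):
    """Return (parents, children) directory names found below path."""
    parents, children = [], []
    for root, dirs, _ in os.walk(path, topdown=True):
        dirs.sort()  # deterministic order; os.walk honors in-place edits
        if root == path:
            parents.extend(dirs)   # first level below path
        else:
            children.extend(dirs)  # second level below path
            dirs.clear()           # stop os.walk from descending further
    return parents, children
```

Because topdown=True lets the caller edit dirs in place, nothing below the second level is ever visited.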

Python directory structure for modules

I have the following directory and file structure in my current directory:
├── alpha
│   ├── A.py
│   ├── B.py
│   ├── Base.py
│   ├── C.py
│   └── __init__.py
└── main.py
Each file under the alpha/ directory is its own class, and each of those classes inherits the Base class in Base.py. Right now, I can do something like this in main.py:
from alpha.A import *
from alpha.B import *
from alpha.C import *
A()
B()
C()
And it works fine. However, if I wanted to add a file and class "D" and then use D() in main.py, I'd have to go into my main.py and add "from alpha.D import *". Is there any way to do an import in my main file so that it imports EVERYTHING under the alpha directory?
It depends on what you are trying to do with the objects. One possible solution could be:
import importlib
import os
for file in os.listdir("alpha"):
    if file.endswith(".py") and not file.startswith("_") and not file.startswith("Base"):
        class_name = os.path.splitext(file)[0]
        module_name = "alpha" + '.' + class_name
        loaded_module = importlib.import_module(module_name)
        loaded_class = getattr(loaded_module, class_name)
        class_instance = loaded_class()
Importing everything with * is not good practice, so if each of your files contains only one class, importing just that class is cleaner (class_name in your case).
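A variation on the same idea, sketched with pkgutil instead of os.listdir so that it does not depend on the current working directory. The function name load_classes and its skip parameter are hypothetical, and the sketch assumes each module defines a class named after its file (A.py defines A):

```python
import importlib
import inspect
import pkgutil


def load_classes(package_name, skip=("Base",)):
    """Import every module in a package and return {class_name: class}."""
    package = importlib.import_module(package_name)
    classes = {}
    for info in pkgutil.iter_modules(package.__path__):
        if info.name.startswith("_") or info.name in skip:
            continue
        module = importlib.import_module(package_name + "." + info.name)
        # Assumption: the class has the same name as its module.
        cls = getattr(module, info.name, None)
        if inspect.isclass(cls):
            classes[info.name] = cls
    return classes
```

In main.py this could be used as classes = load_classes("alpha"), followed by classes["A"]() and so on. A newly added D.py would then be picked up without touching main.py.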

Import Error in django view?

This is my project directory
├── feed
│   ├── admin.py
│   ├── apps.py
│   ├── __init__.py
│   ├── __init__.pyc
│   ├── migrations
│   │   ├── 0001_initial.py
│   │   ├── 0002_auto_20180722_1431.py
│   │  
│   ├── models.py
│   ├── __pycache__
│   │ 
│   ├── templates
│   │   └── feed
│   │       ├── base.html
│   │       ├── footer.html
│   │       ├── header.html
│   │       ├── hindustantimes.html
│   │       ├── index.html
│   │       ├── ndtv.html
│   │       ├── News_Home.html
│   │       ├── republic.html
│   │       └── tredns.html
│   ├── tests.py
│   ├── twitter
│   │   ├── __init__.py
│   │   ├── __pycache__
│   │   │   └── twitter_credentials.cpython-36.pyc
│   │   ├── trends.py
│   │   ├── tweets.py
│   │   └── twitter_credentials.py
│   ├── urls.py
│   └── views.py
├── __init__.py
├── manage.py
├── os
├── pironews
│   ├── __init__.py
│   |
│   ├── __pycache__
│   │   ├── __init__.cpython-36.pyc
│   │   ├── settings.cpython-36.pyc
│   │   ├── urls.cpython-36.pyc
│   │   └── wsgi.cpython-36.pyc
│   ├── settings.py
│   |
│   ├── urls.py
│   └── wsgi.py
└── __pycache__
I want to import the trends.py file in my views:
I am doing the following in the views file
from pironews.feed.twitter.trends import main
def twitter_trend(request):
    output = main()
    print(output)
    return HttpResponse(output, content_type='text/plain')
This is my trends.py file
from __future__ import print_function
import sys # used for the storage class
import requests
import pycurl # used for curling
import base64 # used for encoding string
import urllib.parse # used for encoding
from io import StringIO # used for curl buffer grabbing
import io
import re
import json # used for decoding json token
import time # used for stuff to do with the rate limiting
from time import sleep # used for rate limiting
from time import gmtime, strftime # used for gathering time
import twitter_credentials
OAUTH2_TOKEN = 'https://api.twitter.com/oauth2/token'
class Storage:
    def __init__(self):
        self.contents = ''
        self.line = 0
    def store(self, buf):
        self.line = self.line + 1
        self.contents = "%s%i: %s" % (self.contents, self.line, buf)
    def __str__(self):
        return self.contents
def getYear():
    return strftime("%Y", gmtime())
def getMonth():
    return strftime("%m", gmtime())
def getDay():
    return strftime("%d", gmtime())
def getHour():
    return strftime("%H", gmtime())
def getMinute():
    return strftime("%M", gmtime())
def generateFileName():
    return getYear()+"-"+getMonth()+"-"+getDay()+""
# grabs the rate limit remaining from the headers
def grab_rate_limit_remaining(headers):
    limit = ''
    h = str(headers).split('\n')
    for line in h:
        if 'x-rate-limit-remaining:' in line:
            limit = line[28:-1]
    return limit
# grabs the time the rate limit expires
def grab_rate_limit_time(headers):
    x_time = ''
    h = str(headers).split('\n')
    for line in h:
        if 'x-rate-limit-reset:' in line:
            x_time = line[24:-1]
    return x_time
# obtains the bearer token
def get_bearer_token(consumer_key,consumer_secret):
    # encode consumer key
    consumer_key = urllib.parse.quote(consumer_key)
    # encode consumer secret
    consumer_secret = urllib.parse.quote(consumer_secret)
    # print(type(consumer_secret))
    # create bearer token
    bearer_token = consumer_key+':'+consumer_secret
    # base64 encode the token
    base64_encoded_bearer_token = base64.b64encode(bearer_token.encode('utf-8'))
    # set headers
    headers = {
        "Authorization": "Basic " + base64_encoded_bearer_token.decode('utf-8') + "",
        "Content-Type": "application/x-www-form-urlencoded;charset=UTF-8",
        "Content-Length": "29"}
    response = requests.post(OAUTH2_TOKEN, headers=headers, data={'grant_type': 'client_credentials'})
    to_json = response.json()
    return to_json['access_token']
def grab_a_tweet(bearer_token, tweet_id):
    # url
    url = "https://api.twitter.com/1.1/trends/place.json"
    formed_url ='?id='+tweet_id+'&result_type=popular' #include_entities=true
    headers = [
        str("GET /1.1/statuses/show.json"+formed_url+" HTTP/1.1"),
        str("Host: api.twitter.com"),
        str("User-Agent: jonhurlock Twitter Application-only OAuth App Python v.1"),
        str("Authorization: Bearer "+bearer_token+"")
    ]
    buf = io.BytesIO()
    tweet = ''
    retrieved_headers = Storage()
    pycurl_connect = pycurl.Curl()
    pycurl_connect.setopt(pycurl_connect.URL, url+formed_url) # used to tell which url to go to
    pycurl_connect.setopt(pycurl_connect.WRITEFUNCTION, buf.write) # used for generating output
    pycurl_connect.setopt(pycurl_connect.HTTPHEADER, headers) # sends the custom headers above
    pycurl_connect.setopt(pycurl_connect.HEADERFUNCTION, retrieved_headers.store)
    #pycurl_connect.setopt(pycurl_connect.VERBOSE, True) # used for debugging, really helpful if you want to see what happens
    pycurl_connect.perform() # perform the curl
    tweet += buf.getvalue().decode('UTF-8') # grab the data
    pycurl_connect.close() # stop the curl
    #print retrieved_headers
    pings_left = grab_rate_limit_remaining(retrieved_headers)
    reset_time = grab_rate_limit_time(retrieved_headers)
    current_time = time.mktime(time.gmtime())
    return {'tweet':tweet, '_current_time':current_time, '_reset_time':reset_time, '_pings_left':pings_left}
def main():
    consumer_key = twitter_credentials.CONSUMER_KEY # put your app's consumer key here
    consumer_secret = twitter_credentials.CONSUMER_SECRET # put your app's consumer secret here
    bearer_token = get_bearer_token(consumer_key,consumer_secret)
    tweet = grab_a_tweet(bearer_token,'23424848') # grabs a single tweet & some extra bits
    print(type(tweet['tweet']))
    print(tweet['_current_time'])
    json_obj = json.loads(tweet['tweet'])
    for i in json_obj:
        for j in i['trends']:
            print(j['name'])
But when I run the server, I get the following error:
PiroProject/pironews/feed/views.py", line 20, in <module>
from pironews.feed.twitter.trends import main
ModuleNotFoundError: No module named 'pironews.feed'
What should I do now?
There are two things that need to be done.
In the __init__.py file of the twitter folder, add from .trends import main.
You can then import it in views.py with from .twitter import main.
Try changing from pironews.feed.twitter.trends import main to from twitter.trends import main.

Walk subdirectories in Python starting from the subdirectories

I've the following dir structure:
root
└── env
    ├── team_1
    │   ├── policies
    │   │   └── file.yaml
    │   └── roles
    └── team_2
        ├── policies
        └── roles
and I need to read all the files under a team directory and merge them to create one unique file.
This is my attempt:
env_path = os.path.join('root', env)
if os.path.exists(env_path):
    for team_dir in os.listdir(env_path):
        for root, dirs, files in os.walk(team_dir):
            print(root, dirs, files)
The problem is that os.walk doesn't return anything when I pass team_dir. I should use os.path.join(env_path, team_dir), but at that point it returns the entire tree, which I don't want. How can I get os.walk to return only the subdirectories of a given subdirectory?
You have to use os.path.join(env_path, team_dir), or os.walk won't find anything.
But if you don't want the whole hierarchy in the output, just strip the start of the string:
for team_dir in os.listdir(env_path):
    for root, dirs, files in os.walk(os.path.join(env_path, team_dir)):
        for f in files+dirs:
            print(os.path.join(root,f)[len(env_path)+1:]) # strip start of path + separator
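An alternative to slicing the string is os.path.relpath, which computes the path relative to a chosen directory. Here is a minimal sketch that groups the files per team, which may be a convenient starting point for the merge step. The function name files_by_team is made up for illustration:

```python
import os


def files_by_team(env_path):
    """Map each team directory name to its files, relative to that team."""
    result = {}
    for team in sorted(os.listdir(env_path)):
        team_path = os.path.join(env_path, team)
        if not os.path.isdir(team_path):
            continue  # ignore stray files sitting next to the team directories
        found = []
        for root, _dirs, files in os.walk(team_path):
            for name in files:
                full = os.path.join(root, name)
                found.append(os.path.relpath(full, team_path))
        result[team] = sorted(found)
    return result
```

For the tree in the question, team_1 would map to a single entry for file.yaml inside policies, and team_2 to an empty list.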

Get full file path from GtkTreeView

So, I found a tutorial on creating a file browser using Gtk.TreeView, but I'm facing a problem: when I select a file inside a folder, I can't get the file's full path. I can get the model path, but I don't know what to do with it.
This is my project tree:
.
├── browser
│ ├── index.html
│ └── markdown.min.js
├── compiler.py
├── ide-preview.png
├── __init__.py
├── main.py
├── __pycache__
│ ├── compiler.cpython-35.pyc
│ └── welcomeWindow.cpython-35.pyc
├── pyide-settings.json
├── README.md
├── resources
│ └── icons
│   ├── git-branch.svg
│   ├── git-branch-uptodate.svg
│   └── git-branch-waitforcommit.svg
├── test.py
├── WelcomeWindow.glade
└── welcomeWindow.py
When I click on main.py the path is 4, but if I click on browser/markdown.min.js I get 0:1.
In my code I check whether the path's length (I split the path by ':') is bigger than 1. If not, I open the file normally; if it is... this is where I'm stuck. Can anyone help?
Here is my TreeSelection on changed function:
def onRowActivated(self, selection):
    # print(path.to_string()) # Might do the job...
    model, row = selection.get_selected()
    if row is not None:
        # print(model[row][0])
        path = model.get_path(row).to_string()
        pathArr = path.split(':')
        fileFullPath = ''
        if not os.path.isdir(os.path.realpath(model[row][0])):
            # self.openFile(os.path.realpath(model[row][0]))
            if len(pathArr) <= 1:
                self.openFile(os.path.realpath(model[row][0]))
            else:
                # Don't know what to do!
                pass
            self.languageLbl.set_text('Language: {}'.format(self.sbuff.get_language().get_name()))
    else:
        print('None')
Full code is available at https://github.com/raggesilver/PyIDE/blob/master/main.py
Edit 1: Just to be more specific, my problem is that when I get the name of the file from the TreeView, I can't get the path before it, so I get index.html instead of browser/index.html.
I found a solution to my problem. The logic is to walk the path (e.g. 4:3:5:0) backwards, taking each parent's name and prepending it to the path variable. So we have:
def onRowActivated(self, selection):
    model, row = selection.get_selected()
    if row is not None:
        fullPath = ''
        cur = row
        while cur is not None:
            name = model[cur][0]
            # Prepend the ancestor's name; skip the join on the first pass
            # so the result has no trailing separator.
            fullPath = os.path.join(name, fullPath) if fullPath else name
            cur = model.iter_parent(cur)
        # do whatever with fullPath
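The same walk-up logic can be sketched without GTK, which makes it easy to try in isolation. Here the dictionaries names and parents are hypothetical stand-ins for model[cur][0] and model.iter_parent(cur), and full_path is a made-up helper name:

```python
import os


def full_path(names, parents, node):
    """Join node's name with the names of all its ancestors."""
    parts = []
    while node is not None:
        parts.append(names[node])
        node = parents.get(node)  # None once the top level is reached
    # parts runs from leaf to root, so reverse it before joining.
    return os.path.join(*reversed(parts))
```

Collecting the parts first and joining once avoids the trailing separator that os.path.join(name, '') would produce.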
