How to look up functions of a library in Python?

I just installed this library that scrapes twitter data: https://github.com/kennethreitz/twitter-scraper
I wanted to find out the library's functions and methods so I can start interacting with the library. I have looked around StackOverflow on this topic and tried the following:
pydoc twitter_scraper
help(twitter_scraper)
dir(twitter_scraper)
imported inspect and ran functions = inspect.getmembers(module, inspect.isfunction)
Of the four things I have tried, I have only gotten output from the inspect option so far. I am also unsure (excluding inspect) whether these commands should go in the terminal or a scratch file.
Still quite new at this. Thank you so much for reading everybody!

Excellent question! There are a few options when trying to grok (fully understand) a new library. In your specific case, twitter-scraper, the only function is get_tweets() and the entire library is under 80 lines long.
For the general case, here are the options in decreasing order of usefulness:
Carefully read the project's description on GitHub. The ReadMe is usually the most carefully written piece of documentation.
Larger libraries often have formatted documentation at http://(package-name).readthedocs.org.
pydoc module_name works from the terminal when the module is installed. help(module_name) works in an interactive Python session after you have done an import module_name. These both work from the "docstrings", or strategically placed comments, in the source code. This is also what module_name? does in IPython.
dir(module_name) also requires an import. It lists all the entry points to the module, including lots of odd "dunder" (double underscore) names that you would not normally call or change.
Read the source code. Often, this is easier and more complete than the documentation. If you can bring up the code in an IDE, then jumping around works quickly.
Also, you asked about what can be used within a script:
import os
print("Welcome, human.")
print("dir() is a function, returning a list.")
print("This has no output")
a_list = dir(os)
print("but this does", dir(os))
print("The help() command uses pydoc to print to stdout")
help(os)
print("This program is gratified to be of use.")
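Since the inspect route was the one that produced output for you, here is a minimal sketch of how it could be wrapped up in a script. The helper name public_functions is made up for this example, and json from the standard library is only a stand-in for whatever module you want to explore.

```python
import inspect
import json  # stand-in: replace with the module you want to explore


def public_functions(module):
    """Map each public function name in `module` to the first line of its docstring."""
    summary = {}
    for name, func in inspect.getmembers(module, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private helpers
        doc = inspect.getdoc(func) or ""
        summary[name] = doc.splitlines()[0] if doc else "(no docstring)"
    return summary


if __name__ == "__main__":
    for name, first_line in public_functions(json).items():
        print(f"{name}: {first_line}")
```

Note that inspect.isfunction only matches functions written in Python, so classes and functions implemented in C will not show up in this listing.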

It seems like this library lacks proper documentation, but the GitHub page provides some usage examples to help you get started.
>>> from twitter_scraper import get_tweets
>>> for tweet in get_tweets('kennethreitz', pages=1):
...     print(tweet['text'])
P.S. your API is a user interface
s3monkey just hit 100 github stars! Thanks, y’all!
I’m not sure what this /dev/fd/5 business is, but it’s driving me up the wall.
…
To get more information, simply look at the source code at https://github.com/kennethreitz/twitter-scraper/blob/master/twitter_scraper.py. It seems like the only function is get_tweets, which, looking at the source code, takes in two arguments, the username and the number of pages (optional, defaults to 25).

Related

Keyboard Control Python

I want python to click Enter. However, I don't want to install any outside extensions. I want it to click enter without using them. I found many resources like win32 but I want to do it without using external resources. I don't mind which version of python it is on.
Is this possible?
If so, how?
I have looked on the web but couldn't find anything. Please, can someone help?
Thank you in advance.
Yes.
It should definitely be possible using the ctypes library to talk directly to the Windows DLLs.
If you don't want to install a helper module that has done all this for you, then you pretty much have to do it old school by going after the really low level code.
In short, call the SendInput function of user32.dll by using ctypes.windll.user32.SendInput. Good luck with getting the parameters correct - I don't have the patience to figure it out for you.
Here are the docs for the user32.dll SendInput API
Here is a helpful resource for figuring out the data types. This is actually part of ctypes and can be imported by import ctypes.wintypes but I find it instructional to read the actual code.
So for example, to create a WORD with the value of the Virtual Key Code VK_RETURN, I think it would be
>>> import ctypes.wintypes
>>> ctypes.wintypes.WORD(0x0D)
c_ushort(13)
But there is a lot more you have to piece together from there. Just build up your parameters and call the function.
Last hint, use the examples in the second link for how to build "C" type structures. Then build one yourself using ctypes.Structure
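For what it's worth, here is a rough sketch of how the pieces described above might fit together. It is untested against any particular Windows version, so treat it as a starting point rather than a finished solution. The structure layouts and constants (INPUT_KEYBOARD, KEYEVENTF_KEYUP, VK_RETURN) follow the Win32 headers; the helper names make_key_event and press_enter are made up for this example.

```python
import ctypes
import sys

# Fixed-width equivalents of the Win32 types used by SendInput.
WORD = ctypes.c_uint16
DWORD = ctypes.c_uint32
LONG = ctypes.c_int32
ULONG_PTR = ctypes.c_size_t  # pointer-sized unsigned integer

INPUT_KEYBOARD = 1
KEYEVENTF_KEYUP = 0x0002
VK_RETURN = 0x0D


class MOUSEINPUT(ctypes.Structure):
    _fields_ = [("dx", LONG), ("dy", LONG), ("mouseData", DWORD),
                ("dwFlags", DWORD), ("time", DWORD), ("dwExtraInfo", ULONG_PTR)]


class KEYBDINPUT(ctypes.Structure):
    _fields_ = [("wVk", WORD), ("wScan", WORD), ("dwFlags", DWORD),
                ("time", DWORD), ("dwExtraInfo", ULONG_PTR)]


class HARDWAREINPUT(ctypes.Structure):
    _fields_ = [("uMsg", DWORD), ("wParamL", WORD), ("wParamH", WORD)]


class _INPUTUNION(ctypes.Union):
    # All three members are declared so sizeof(INPUT) matches the C struct.
    _fields_ = [("mi", MOUSEINPUT), ("ki", KEYBDINPUT), ("hi", HARDWAREINPUT)]


class INPUT(ctypes.Structure):
    _anonymous_ = ("u",)
    _fields_ = [("type", DWORD), ("u", _INPUTUNION)]


def make_key_event(vk, key_up=False):
    """Build one keyboard INPUT record for the given virtual key code."""
    event = INPUT()
    event.type = INPUT_KEYBOARD
    event.ki = KEYBDINPUT(vk, 0, KEYEVENTF_KEYUP if key_up else 0, 0, 0)
    return event


def press_enter():
    """Send an Enter key-down followed by key-up (Windows only)."""
    if sys.platform != "win32":
        raise OSError("SendInput is only available on Windows")
    events = (INPUT * 2)(make_key_event(VK_RETURN),
                         make_key_event(VK_RETURN, key_up=True))
    send_input = ctypes.windll.user32.SendInput
    send_input.argtypes = (ctypes.c_uint, ctypes.POINTER(INPUT), ctypes.c_int)
    send_input.restype = ctypes.c_uint
    sent = send_input(len(events), events, ctypes.sizeof(INPUT))
    if sent != len(events):
        raise OSError("SendInput injected %d of %d events" % (sent, len(events)))


if __name__ == "__main__" and sys.platform == "win32":
    press_enter()
```

The keystroke goes to whichever window has focus at the time of the call, so you would normally run this while the target application is in the foreground.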

How to see changes for a mercurial file context?

I'm currently trying to write a script that will find all the files changed given a certain # in the task description, and I have gotten the script to work for that. But now I'm trying to sort it by whether the file was added, modified or removed. I've looked through the Mercurial API, but I can't find anything that can do what I want.
My code currently uses repo[revnum].description() and parses that to find which ones contain the #, and if they do, add the file context to a list.
This works fine and I can print a list of files, but I can't find a method to see what was done with each context. Can anyone help me out here, or point me to some better documentation?
Do you need to work with the Mercurial API? It is possible to do what you need by working with the output of hg log.
In general, you should avoid writing scripts directly using the Mercurial API. It is better to write your scripts to use the CLI or perhaps even use hglib. As stated on the MercurialApi wiki:
For the vast majority of third party code, the best approach is to use
Mercurial's published, documented, and stable API: the command line
interface.
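That being said, if you really need to use the API, you can use repo.status() to find the info you asked about. Note that the call below assumes revnum - 1 is the parent of revnum, which only holds for linear history: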
modified, added, removed, deleted, unknown, ignored, clean = repo.status(revnum-1, revnum)
I ended up using something similar to what Tim said, although I did still use the API.
I imported commands from mercurial, and then called commands.status(repo.ui, repo, change=revnum)
I captured the output of this, using repo.ui.pushbuffer() and repo.ui.popbuffer() which was in the form
A file_path1
R file_path2
R file_path3
A file_path4
M file_path5
I parsed this output and sorted it into added, removed, modified, etc.
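For anyone who prefers the command-line route suggested above, the same grouping could be sketched roughly like this. It assumes hg is on the PATH and relies on hg status --change REV, which prints the same "A path" / "M path" / "R path" lines. The function names parse_status and status_for_revision are made up for this example.

```python
import subprocess
from collections import defaultdict

# Status codes as printed by `hg status`; other codes are kept as-is.
STATUS_NAMES = {"M": "modified", "A": "added", "R": "removed"}


def parse_status(output):
    """Group lines such as 'A some/path' by their status code."""
    groups = defaultdict(list)
    for line in output.splitlines():
        if len(line) < 3 or line[1] != " ":
            continue  # not a status line
        code, path = line[0], line[2:]
        groups[STATUS_NAMES.get(code, code)].append(path)
    return dict(groups)


def status_for_revision(rev, repo_path="."):
    """Run `hg status --change REV` in repo_path and group the changed files."""
    result = subprocess.run(
        ["hg", "status", "--change", str(rev)],
        cwd=repo_path, capture_output=True, text=True, check=True,
    )
    return parse_status(result.stdout)
```

Keeping the parsing separate from the hg call means the parser also works on output captured through repo.ui.pushbuffer() and repo.ui.popbuffer().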

Help with PyEPL logging

I have never used Python before, most of my programming has been in MATLAB and Unix. However, recently I have been given a new assignment that involves fixing an old PyEPL program written by a former employee (I've tried contacting him directly but he won't respond to my e-mails). I know essentially nothing about Python, and though I am picking it up, I thought I'd just quickly ask for some advice here.
Anyway, there are two issues at hand here, really. The first is this segment of the code:
exp = Experiment()
exp.setBreak()
vt = VideoTrack("video")
at = AudioTrack("audio")
kt = KeyTrack("key")
log = LogTrack("session")
clk = PresentationClock()
I understand what this is doing; it is creating a series of tracking files in the directory after the program is run. However, I have searched a bunch of online tutorials and can't find a reference to any of these commands in them. Maybe I'm not searching the right places or something, but I cannot find ANYTHING about this.
What I need to do is modify the
log = LogTrack("session")
segment of the code, so that all of the session.log files go into a new directory, separate from the other log files. But I also need to find a way to not only concatenate them into a single session.log file, but add a new column to that file that will add the subject number (the program is meant to be run by multiple subjects to collect data).
I am not asking anyone to do my work for me, but if anyone could give me some pointers, or any sort of advice, I would greatly appreciate it.
Thanks
I would first check if there is a line in the code
from some_module_name import *
This could easily explain why you can call these functions (classes?). It will also tell you what file to look in to modify the code for LogTrack.
Edit:
So, a little digging seems to find that LogTrack is part of PyEPL's textlog module. These other classes are from other modules. Somewhere in this person's code should be a line something like:
from PyEPL.display import VideoTrack
from PyEPL.sound import AudioTrack
from PyEPL.textlog import LogTrack
...
This means that these are classes specific to PyEPL. There are a few ways you could go about modifying how they work. You can modify the source of the LogTrack class so that it operates differently. Perhaps easier would be to simply subclass LogTrack and change some of its methods.
Either of these will require a fairly thorough understanding of how this class operates.
In any case, I would download the source from here, open up the code/textlog.py file, and start reading how LogTrack works.
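For the concatenation part of the question, it may be simpler to leave PyEPL alone and merge the logs afterwards with a separate script. Below is a minimal sketch that assumes a layout of data_dir/SUBJECT/session.log with one log entry per line. That layout is a guess, so adjust the paths to wherever the experiment actually writes its files. The function name merge_session_logs is made up for this example, and the script is meant to be run on its own with Python 3, not from inside the experiment.

```python
import os


def merge_session_logs(data_dir, out_path, log_name="session.log"):
    """Concatenate data_dir/<subject>/<log_name> into out_path.

    Each line is prefixed with the subject directory name and a tab,
    which adds the subject as a new first column.
    Returns the number of lines written.
    """
    written = 0
    with open(out_path, "w") as out:
        for subject in sorted(os.listdir(data_dir)):
            log_path = os.path.join(data_dir, subject, log_name)
            if not os.path.isfile(log_path):
                continue  # skip entries that have no session log
            with open(log_path) as log:
                for line in log:
                    line = line.rstrip("\n")
                    if line:  # ignore blank lines
                        out.write(f"{subject}\t{line}\n")
                        written += 1
    return written
```

Calling merge_session_logs("data", "all_sessions.log") would then produce a single file with the subject in the first column.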

Python/C "defs" file - what is it?

In the nautilus-python bindings, there is a file "nautilus.defs". It contains stanzas like
(define-interface MenuProvider
(in-module "Nautilus")
(c-name "NautilusMenuProvider")
(gtype-id "NAUTILUS_TYPE_MENU_PROVIDER")
)
or
(define-method get_mime_type
(of-object "NautilusFileInfo")
(c-name "nautilus_file_info_get_mime_type")
(return-type "char*")
)
Now I can see what most of these do (eg. that last one means that I can call the method "get_mime_type" on a "FileInfo" object). But I'd like to know: what is this file, exactly (ie. what do I search the web for to find out more info)? Is it a common thing to find in Python/C bindings? What is the format, and where is it documented? What program actually processes it?
(So far, I've managed to glean that it gets transformed into a C source file, and it looks a bit like lisp to me.)
To answer your "What program actually processes it?" question:
From Makefile.in in the src directory, the command that translates the .defs file into C is PYGTK_CODEGEN. To find out what PYGTK_CODEGEN is, look in the top-level configure.in file, which contains these lines:
AC_MSG_CHECKING(for pygtk codegen)
PYGTK_CODEGEN="$PYTHON `$PKG_CONFIG --variable=codegendir pygtk-2.0`/codegen.py"
AC_SUBST(PYGTK_CODEGEN)
AC_MSG_RESULT($PYGTK_CODEGEN)
So the program that processes it is a Python script called codegen.py, that apparently has some link with PyGTK. Now a Google search for PyGTK codegen gives me this link as the first hit, which says:
"PyGTK-Codegen is a system for automatically generating wrappers for interfacing GTK code with Python."
and also gives some examples.
As for: "What is the format, and where is it documented?". As others have said, the code looks a lot like simple Scheme. I couldn't find any documentation at all on codegen on the PyGTK site; this looks like one of those many dark corners of open source that isn't well documented. Your best bet would probably be to download a recent tarball for PyGTK, look through the sources for the codegen.py file and see if the file itself contains sufficient documentation.
All you need to create Python bindings for C code is to use the Python / C API. However, the API can be somewhat repetitive and redundant, and so various forms of automation may be used to create them. For example, you may have heard of swig. The LISP-like (Scheme) code that you see is simply a configuration file for PyGTK-Codegen, which is a similar automation program for creating bindings to Python.
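To get a feel for how regular the format is, here is a toy reader for the two stanzas quoted in the question. It is only an illustration and not how codegen.py works internally. It handles flat (key "value") pairs only, whereas real .defs files can also contain nested lists. The function name parse_stanza is made up for this example.

```python
import re

# Tokens: parentheses, double-quoted strings, or bare words.
TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')


def parse_stanza(text):
    """Parse one '(define-kind name (key "value") ...)' stanza into a dict."""
    tokens = TOKEN.findall(text)
    if len(tokens) < 4 or tokens[0] != "(" or tokens[-1] != ")":
        raise ValueError("expected a single parenthesised stanza")
    stanza = {"kind": tokens[1], "name": tokens[2], "attrs": {}}
    i = 3
    while i < len(tokens) - 1:
        group = tokens[i:i + 4]
        if len(group) != 4 or group[0] != "(" or group[3] != ")":
            raise ValueError("expected a flat (key value) pair")
        stanza["attrs"][group[1]] = group[2].strip('"')
        i += 4
    return stanza


if __name__ == "__main__":
    example = '''
    (define-method get_mime_type
      (of-object "NautilusFileInfo")
      (c-name "nautilus_file_info_get_mime_type")
      (return-type "char*")
    )
    '''
    print(parse_stanza(example))
```

Each stanza boils down to a kind, a name, and a handful of attributes, which is the information a binding generator needs in order to emit the matching C wrapper.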

Bash alias to Python script -- is it possible?

The particular alias I'm looking to "class up" into a Python script happens to be one that makes use of the cUrl -o (output to file) option. I suppose I could as easily turn it into a BASH function, but someone advised me that I could avoid the quirks and pitfalls of the different versions and "flavors" of BASH by taking my ideas and making them Python scripts.
Coincident with this idea is another notion I had to make a feature of legacy Mac OS (officially known as "OS 9" or "Classic") pertaining to downloads platform-independent: writing the URL to some part of the file visible from one's file navigator {Konqueror, Dolphin, Nautilus, Finder or Explorer}. I know that only a scant few file types support this kind of thing using some other command-line tools (exiv2, wrjpgcom, etc). Which is perfectly fine with me as I only use this alias to download single-page image files such as JPEGs anyways.
I reckon I might as well take full advantage of the power of Python by having the script pass the string which is the source URL of the download (entered by the user and used first by cUrl) to something like exiv2 which could write it to the Comment block, EXIF User Comment block, and (taking as a first and worst example) Windows XP's File Description field. Starting small is sometimes a good way to start.
Hope someone has advice or suggestions.
BZT
The relevant section from the Bash manual states:
Aliases allow a string to be
substituted for a word when it is used
as the first word of a simple command.
So, there should be nothing preventing you from doing e.g.
$ alias geturl="python /some/cool/script.py"
Then you could use it like any other shell command:
$ geturl http://example.com/excitingstuff.jpg
And this would simply call your Python program.
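As for what /some/cool/script.py might contain, here is a minimal sketch of a "curl -o" stand-in that uses only the standard library (urllib.request instead of cURL or pycurl). The function names download and default_filename are made up for this example, and writing the URL into the image metadata is left out.

```python
import posixpath
import shutil
import sys
import urllib.request
from pathlib import Path
from urllib.parse import urlsplit


def default_filename(url):
    """Use the last path component of the URL as the file name."""
    return posixpath.basename(urlsplit(url).path) or "download"


def download(url, dest=None):
    """Fetch url and write the body to dest, much like `curl -o dest url`."""
    dest = Path(dest) if dest is not None else Path(default_filename(url))
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


# Only act as a command when at least a URL was passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    print(download(*sys.argv[1:3]))
```

With the alias above, geturl http://example.com/excitingstuff.jpg would save excitingstuff.jpg in the current directory, and an optional second argument would name the output file.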
I thought PycURL might be the answer. Ahh, Daniel Stenberg and his innocent presumptions that everybody knows what he does. I asked on the list whether or not pycurl had a "curl -o" analogue, and then asked 'If so: How would one go about coding it/them in a Python script?' His reply was the following:
"curl.setopt(pycurl.WRITEDATA, fp)
possibly combined with:
curl.setopt(pycurl.WRITEFUNCTION, callback) "
...along with Sourceforge links to two revisions of retriever.py. I can barely recall where easy_install put the one I've got; how am I supposed to compare them?
It's pretty apparent this gentleman never had a helpdesk or phone tech support job in the Western Hemisphere, where you have to assume the 'customer' just learned how to use their comb yesterday and be prepared to walk them through everything and anything. One-liners (or three-liners with abstruse links as chasers) don't do it for me.
BZT
