I can't get Python to open a link that uses the contents of a .txt file as a query string. I'm working on Python 3.7.0 and was able to write code that opens the website and checks a string that I've input directly, as well as code that opens my text file and prints the contents, but when I try to use the text file's contents as the query, it throws an error.
I added lines that print the link I would need to open, to make sure it comes out correctly, and that works fine: I can copy and paste it into my browser and get a correct result.
Here's the code I used
And a screenshot of the error I get
I'm a total beginner at this so any suggestions or explanations would be lifesavers!
The error is in the string being passed to urlopen(). When it tries to open the link, you get an HTTP 400: Bad Request error, which means that something is wrong with the link you provided. The text possibly contains spaces, and you aren't escaping the characters properly. Here is the link which could help you.
Alternatively, you could also use the Python Requests library.
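The escaping mentioned above can be sketched with the standard library. This is a minimal example, not the asker's code: the base URL and the parameter name "q" are placeholders, and the string passed in stands in for whatever .read() returns from the text file.

```python
from urllib.parse import urlencode


def build_url(base_url, query_text, param="q"):
    """Return base_url with query_text attached as an escaped query string."""
    # strip() removes the trailing newline that file.read() usually returns
    cleaned = query_text.strip()
    # urlencode percent-encodes unsafe characters and turns spaces into '+'
    return base_url + "?" + urlencode({param: cleaned})


# Hypothetical base URL; the second argument stands in for the file contents
print(build_url("http://example.com/search", "hello world\n"))
```

The resulting string can then be passed to urlopen(). Whether the site expects a parameter called "q" is an assumption; use whatever name the real link uses.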
(Please include the code in the question rather than a screenshot.)
Check that the URL you're requesting actually exists. Also, I'm not sure what your .txt file looks like, but re-examine the code (the .read() part) to make sure the data you want to add as a query is being handled correctly.
I ran some commands in Jupyter Notebook and expected printed output containing the data from a .csv file in tabulated form, but I get incomplete output instead.
This is the result I get from the .csv file
I ran this command:
df1=pandas.read_csv("supermarkets.csv", on_bad_lines='skip')
df1
I expected to get printed output in tabulated form, as in the attached image.
The data gets printed in well-tabulated form here
Here is a link to the online version of the file
[pythonhow.com/supermarkets.csv]
Getting good, clean data where the file extension correctly matches the actual content is often a challenge. Assessing the state of the input data is almost always an important first step.
It appears the data you are trying to get is also online here. GitHub will render that as a table in the browser because it has a viewer mode. To look at the 'raw' file content, click here. You'll see it is a nice comma-delimited file with columns separated by commas and each row on its own line. The header with the column names is on the first line.
Now open the file you are working with in a good text editor and compare it to the content I pointed you at. That should show you what the issue is.
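If a text editor is not handy, the same inspection can be done from Python. The helper below is a hypothetical sketch (the name first_lines is made up) that returns the first few raw lines of a file so they can be compared with the comma-delimited content described above.

```python
from itertools import islice


def first_lines(path, n=5, encoding="utf-8"):
    """Return the first n lines of a text file, without trailing newlines."""
    with open(path, encoding=encoding) as handle:
        return [line.rstrip("\n") for line in islice(handle, n)]


# Example use, assuming the file sits next to the notebook:
# for line in first_lines("supermarkets.csv"):
#     print(repr(line))
```

Printing with repr() makes stray brackets, quotes, or odd delimiters easier to spot than a plain print would.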
At this point you may just wish to switch to using the version of the file that I pointed you at.
Use the link below to obtain it as a proper CSV file:
https://raw.githubusercontent.com/kenvilar/data-analysis-using-python/master/supermarkets.csv
You should be able to paste that link into your browser, then right-click on the page and choose 'Save as..' to download it to your local machine. The obtained file should open just fine using the code you showed in the screenshot in your post here.
Please work on writing better questions with specific titles, see here for guidance. The title at present is overly broad and is actually not accurate. This code would not work with the data you apparently have even if you were running it inside a Python code-based script. And so it is not a Jupyter notebook issue. For how to think about making it specific, a good thing to keep in mind is to write for your future self. If you continue to use notebooks you'll have hundreds that would be considered a 'Jupyter Notebook issue', but what makes this issue different from those?
I believe there is an issue with your csv file, not the code.
To me it looks like the data in your csv file are written in json format.
Have you opened the supermarkets.csv file using Excel? It should look like a table, not a JSON-formatted file.
Did you try df1.head() to see whether the CSV got read in the first place?
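One way to test the suspicion that the file actually holds JSON is to try parsing its text as JSON. This is only a sketch using the standard library; looks_like_json is a hypothetical helper name and the sample strings are made up.

```python
import json


def looks_like_json(text):
    """Return True if text parses as a JSON object or array."""
    stripped = text.lstrip()
    # JSON documents of interest here start with '{' or '['
    if not stripped.startswith(("{", "[")):
        return False
    try:
        json.loads(stripped)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


# Made-up samples: the first is JSON, the second is comma-delimited text
print(looks_like_json('[{"id": 1, "name": "a"}]'))
print(looks_like_json("id,name\n1,a\n"))
```

If the check returns True for the contents of supermarkets.csv, the file is a candidate for a JSON reader rather than a CSV reader.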
I am writing a small class assignment in Python. The raw_input is supposed to be a link like 'http://python-data.dr-chuck.net/comments_243948.xml'. If this works, then I can parse some of the data. I am doing this assignment using PyCharm as the IDE. When it prompts me to enter a location and I type or paste in the above link and hit Enter, it just opens the linked page and does not go on to process the rest of the data. Is there a way to enter this link without having it pop up the linked page? Please help me. Thanks.
While Stack Overflow isn't a homework-answering site, I can provide pointers on documentation to look at:
urllib2.urlopen will allow you to create a file-like object which reads from a web address.
The ElementTree XML API will allow you to parse XML without 3rd-party libraries.
These two should provide enough examples to get you on your way.
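As a rough illustration of how those two pieces fit together, here is a Python 3 sketch (in Python 3 the urlopen function lives in urllib.request rather than urllib2). The element name "count" and the shape of the sample document are assumptions about what the assignment's XML looks like, not something confirmed by the question.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen


def fetch(url):
    """Download the raw bytes behind a URL."""
    with urlopen(url) as response:
        return response.read()


def sum_counts(xml_data):
    """Add up the integer text of every <count> element in the document."""
    root = ET.fromstring(xml_data)
    return sum(int(node.text) for node in root.iter("count"))


# Made-up sample with the assumed structure; no network access needed
sample = (
    b"<commentinfo><comments>"
    b"<comment><name>A</name><count>3</count></comment>"
    b"<comment><name>B</name><count>4</count></comment>"
    b"</comments></commentinfo>"
)
print(sum_counts(sample))
```

With a real address, sum_counts(fetch(url)) would chain the two steps, where url is whatever the user typed at the prompt.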
If your problem is with PyCharm automatically redirecting URLs entered in the console (which is a problem I can't seem to reproduce), the easiest solution is to simply always use the terminal.
I ran into the same problem.
My workaround is to type an extra non-whitespace character after the URL before pressing Enter, so that the console does not jump to the URL.
I am writing a Python script for mass replacement of links (actually image and script sources) in HTML files; I am using lxml. There is one problem: the HTML files are quizzes and they have data packaged like this (there is also some Cyrillic here):
<input class="question_data" value="{"text":"<p>[1] је наука која се бави чувањем, обрадом и преносом информација помоћу рачунара.</p>","fields":[{"id":"1","type":"fill","element":{"sirina":"103","maxDuzina":"12","odgovor":["Информатика"]}}]}" name="question:1:data" id="id3a1"/>
When I try to print out this data in python using:
print "OLD_DATA:", data
It just prints out the error "UnicodeEncodeError: character maps to undefined". There are more of these elements. My goal is to change the links of images in the value part of the input, but I can't change the links if I don't know how to print this data (or how it should be written to the file). How does Python handle (interpret) this? Please help. Thanks!
You're running into the same problem I've hit many times in the past. That error almost always means that the console environment you're using can't display the characters it's trying to print. It might be worth trying to log to a file instead, then opening the log in an editor that can display the characters.
If you really want to be able to see it on your console, it might be worth writing a function to screen the strings you're printing for unprintable characters.
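Both suggestions can be sketched roughly as follows. The example is written for Python 3, whereas the question uses the Python 2 print statement, so treat it as an illustration of the idea. The helper names are hypothetical, and "ascii" stands in for whatever encoding the console actually uses.

```python
def printable_for(text, encoding="ascii"):
    """Replace characters the target encoding cannot represent with \\uXXXX escapes."""
    return text.encode(encoding, errors="backslashreplace").decode(encoding)


def log_to_file(text, path):
    """Append text to a UTF-8 log file, which can hold any character."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(text + "\n")


# The Cyrillic word is taken from the quiz data in the question
print(printable_for("odgovor: Информатика"))
```

The screened string is only for display. The original string should still be the one written back to the HTML file.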
I also found a couple of other Stack Overflow posts that might be helpful in your efforts:
How do I get Cyrillic in the output, Python?
What is right way to use cyrillic in python lxml library
I would also recommend this article and this Python manual entry:
https://docs.python.org/2/howto/unicode.html
http://www.joelonsoftware.com/articles/Unicode.html
I want to download text files from pastebin.com.
Once I start the program it should look for text files that are being uploaded and "download" them once they're uploaded.
I know how to "download" them but not how to tell Python to click on one of the public files on http://pastebin.com/archive and then click on the "raw"-button to open a new tab that contains the "raw" content.
I googled a lot but literally nothing came up that would help me.
Thanks
Well, a program doesn't know how to "click" anything :). In order to retrieve information from a page, you simply need to send a GET request to the correct URL. In your case, that would be http://pastebin.com/raw/4ffLHviP or the code of any other paste you want to download. You can retrieve codes manually, or e.g. by applying text parsers (regex, beautifulsoup...) to the archive page.
Note that there is an API for scraping Pastebin (see http://pastebin.com/scraping). If you want to extract a substantial amount of content from them, using it is strongly recommended. It is more "polite", may offer better service, and will keep you from being blacklisted.
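The regex route for collecting paste codes could look something like this. It is a sketch under assumptions: it supposes that the archive page links to pastes as href="/XXXXXXXX" with an eight-character alphanumeric code, which may not match the real markup, and the HTML in the example is made up.

```python
import re

# Assumed link shape: href="/" followed by exactly eight letters or digits
PASTE_LINK = re.compile(r'href="/([A-Za-z0-9]{8})"')


def extract_paste_ids(archive_html):
    """Return the paste codes found in the page, in order, without duplicates."""
    seen = []
    for paste_id in PASTE_LINK.findall(archive_html):
        if paste_id not in seen:
            seen.append(paste_id)
    return seen


def raw_url(paste_id):
    """Build the raw-content URL for a paste code."""
    return "http://pastebin.com/raw/" + paste_id


# Made-up snippet standing in for the downloaded archive page
page = '<a href="/4ffLHviP">one</a> <a href="/archive">archive</a>'
for code in extract_paste_ids(page):
    print(raw_url(code))
```

Each resulting URL can then be fetched with an ordinary GET request.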
To choose a file you simply do the following:
Visit the link of the file, e.g. http://pastebin.com/B8A6L7Zt
The raw content is already on that page, namely inside <textarea id='paste_code'>...</textarea>. So you just extract this content, using a regex for example.
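A minimal sketch of that extraction, assuming the page really does wrap the paste in a textarea with id paste_code as described above. The HTML string in the example is made up, and a regex is used only because the answer suggests one; an HTML parser would be more robust.

```python
import html
import re

# Match the textarea whose id is paste_code, with either quote style
TEXTAREA = re.compile(
    r"<textarea[^>]*id=['\"]paste_code['\"][^>]*>(.*?)</textarea>",
    re.DOTALL,
)


def extract_paste(page_html):
    """Return the unescaped text inside the paste_code textarea, or None."""
    match = TEXTAREA.search(page_html)
    if match is None:
        return None
    # Content inside a textarea is HTML-escaped, so undo entities like &lt;
    return html.unescape(match.group(1))


# Made-up page fragment
print(extract_paste("<textarea id='paste_code'>x &lt; y</textarea>"))
```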
I want to insert an image in a web application which uses Python as the server-side scripting language. I am using Python 2.7 on Windows. I have written a simple script to insert an image.
print "<img src='image.png'>"
I am not getting any errors or warnings from this script, and the page loads successfully, but the image is not displayed. The image file exists in the same folder as the Python file, and the image is still not displayed even if I provide an absolute path in the src attribute of the img tag.
Shall I import any extra packages? If yes, then please mention them.
Could anybody please suggest a solution to this problem?
No, you need not import any extra packages. Just use a raw string in Python, because if your path contains sequences like \n, they would otherwise be interpreted as a newline. Like this:
print r'<img src="c:\path\new\image.png">'
But when you are printing HTML on the server side, you are actually doing CGI programming, and I would suggest starting with some good tutorials.
This isn't a python issue, this is an issue with the HTML you are outputting.
The tag is fine. Have you checked that the extension of the image is the same, and that the case (capitalization) is exactly the same? If the file is named imAge.PNG and you put image.png it won't work.
Also, check the path you used. Make sure you are using forward slashes (/).
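The forward-slash advice can be sketched as below. This is written for Python 3, while the question uses 2.7, and img_tag is a hypothetical helper. Whether the web server actually serves the image at that path depends on its configuration, which this example cannot check.

```python
import html


def img_tag(path):
    """Build an img tag whose src uses forward slashes and is HTML-escaped."""
    url_path = path.replace("\\", "/")
    return '<img src="{}">'.format(html.escape(url_path, quote=True))


# A CGI response needs a header and a blank line before the HTML body
print("Content-Type: text/html")
print()
print(img_tag(r"images\new\image.png"))
```

The raw string keeps \n in the path from being read as a newline, and the replace call converts the Windows separators into the forward slashes that URLs expect.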