I am converting my Python program from paramiko to ssh2. I have succeeded in authenticating and I can get a directory listing. Where I am stuck: as I process the directory listing, how do I recognize whether each "file" is a directory or a regular file? I see the attributes, but of those the only one I can obviously use is atime (to know how old the file is). Once I have done the opendir and readdir (and so have a listing of files), how do I recognize whether each is a file or a directory?
When I do the readdir I am returned:
- length of filename
- filename
- attributes:
  - atime
  - filesize
  - flags
  - gid
  - mtime
  - permissions
  - uid
I haven't used ssh2-python myself, but I would check the contents of flags. According to the library's documentation (as suggested by @NullPointerException), the possible values are:
LIBSSH2_SFTP_S_IFMT: type-of-file mask
LIBSSH2_SFTP_S_IFIFO: named pipe (FIFO)
LIBSSH2_SFTP_S_IFCHR: character special (character device)
LIBSSH2_SFTP_S_IFDIR: directory
LIBSSH2_SFTP_S_IFBLK: block special (block device)
LIBSSH2_SFTP_S_IFREG: regular file
LIBSSH2_SFTP_S_IFLNK: symbolic link
LIBSSH2_SFTP_S_IFSOCK: socket
flags is a bit field, so you have to extract the file-type bits and compare them against the constant you are interested in. For example, to check whether an entry is a directory:
flags & LIBSSH2_SFTP_S_IFMT == LIBSSH2_SFTP_S_IFDIR
(Testing flags & LIBSSH2_SFTP_S_IFDIR alone can misfire, because the file-type values share bits: a socket, S_IFSOCK, also has the S_IFDIR bit set.)
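This kind of type check can be sketched in Python with the standard stat module, whose S_IF* constants match the LIBSSH2_SFTP_S_* values. The sketch assumes the entry's mode bits are exposed on an attribute named `permissions`; adapt the field name to whatever your readdir call actually returns:

```python
import stat

def classify(attrs):
    """Classify an SFTP directory entry from its POSIX-style mode bits.

    Assumes the attribute object exposes the mode bits in a field named
    `permissions` (a sketch; check your binding's attribute names).
    """
    mode = attrs.permissions
    if stat.S_ISDIR(mode):
        return 'directory'
    if stat.S_ISREG(mode):
        return 'file'
    if stat.S_ISLNK(mode):
        return 'symlink'
    return 'other'
```

The stat helpers mask with S_IFMT internally, so the socket/directory bit overlap described above is handled correctly.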
In the execution output, the error says that the directory "JSONFiles/Apartment/Rent/dubizzleabudhabiproperty" is not there.
But when I look at my files, the folder is definitely there.
Update 2
The code
self.file = open("JSONFiles/" + item["category"] + "/" + item["action"] + "/" + item["source"] + "/" + fileName + ".json", "wb")  # Create a new JSON file named after the fileName parameter
line = json.dumps(dict(item))  # Serialize the item as a one-line JSON string
self.file.write(line)  # Write the item to the file
UPDATE
When I change the file name to a shorter one, it works, so the problem is the length of the path. What is the solution, please?
Regular DOS paths are limited to MAX_PATH (260) characters, including the string's terminating NUL character. You can exceed this limit by using an extended-length path that starts with the \\?\ prefix. This path must be a Unicode string, fully qualified, and use only backslash as the path separator. Per Microsoft's file system functionality comparison, the maximum extended path length is 32,760 characters. An individual file or directory name can be up to 255 characters (127 for the UDF filesystem). Extended UNC paths are also supported as \\?\UNC\server\share.
For example:
import os

def winapi_path(dos_path, encoding=None):
    if (not isinstance(dos_path, unicode) and
            encoding is not None):
        dos_path = dos_path.decode(encoding)
    path = os.path.abspath(dos_path)
    if path.startswith(u"\\\\"):
        return u"\\\\?\\UNC\\" + path[2:]
    return u"\\\\?\\" + path
path = winapi_path(os.path.join(u"JSONFiles",
                                item["category"],
                                item["action"],
                                item["source"],
                                fileName + ".json"))
>>> path = winapi_path("C:\\Temp\\test.txt")
>>> print path
\\?\C:\Temp\test.txt
See the following pages on MSDN:
Naming Files, Paths, and Namespaces
Defining an MS-DOS Device Name
Kernel object namespaces
Background
Windows calls the NT runtime library function RtlDosPathNameToRelativeNtPathName_U_WithStatus to convert a DOS path to a native NT path. If we open (i.e. CreateFile) the above path with a breakpoint set on the latter function, we can see how it handles a path that starts with the \\?\ prefix.
Breakpoint 0 hit
ntdll!RtlDosPathNameToRelativeNtPathName_U_WithStatus:
00007ff9`d1fb5880 4883ec58 sub rsp,58h
0:000> du #rcx
000000b4`52fc0f60 "\\?\C:\Temp\test.txt"
0:000> r rdx
rdx=000000b450f9ec18
0:000> pt
ntdll!RtlDosPathNameToRelativeNtPathName_U_WithStatus+0x66:
00007ff9`d1fb58e6 c3 ret
The result replaces \\?\ with the NT DOS devices prefix \??\, and copies the string into a native UNICODE_STRING:
0:000> dS b450f9ec18
000000b4`536b7de0 "\??\C:\Temp\test.txt"
If you use //?/ instead of \\?\, then the path is still limited to MAX_PATH characters. If it's too long, then RtlDosPathNameToRelativeNtPathName returns the status code STATUS_NAME_TOO_LONG (0xC0000106).
If you use \\?\ for the prefix but use slash in the rest of the path, Windows will not translate the slash to backslash for you:
Breakpoint 0 hit
ntdll!RtlDosPathNameToRelativeNtPathName_U_WithStatus:
00007ff9`d1fb5880 4883ec58 sub rsp,58h
0:000> du #rcx
0000005b`c2ffbf30 "\\?\C:/Temp/test.txt"
0:000> r rdx
rdx=0000005bc0b3f068
0:000> pt
ntdll!RtlDosPathNameToRelativeNtPathName_U_WithStatus+0x66:
00007ff9`d1fb58e6 c3 ret
0:000> dS 5bc0b3f068
0000005b`c3066d30 "\??\C:/Temp/test.txt"
Forward slash is a valid object name character in the NT namespace. It's reserved by Microsoft filesystems, but you can use a forward slash in other named kernel objects, which get stored in \BaseNamedObjects or \Sessions\[session number]\BaseNamedObjects. Also, I don't think the I/O manager enforces the policy on reserved characters in device and filenames. It's up to the device. Maybe someone out there has a Windows device that implements a namespace that allows forward slash in names. At the very least you can create DOS device names that contain a forward slash. For example:
>>> import ctypes
>>> kernel32 = ctypes.WinDLL('kernel32')
>>> kernel32.DefineDosDeviceW(0, u'My/Device', u'C:\\Temp')
>>> os.path.exists(u'\\\\?\\My/Device\\test.txt')
True
You may be wondering what \?? signifies. This used to be an actual directory for DOS device links in the object namespace, but starting with NT 5 (or NT 4 w/ Terminal Services) this became a virtual prefix. The object manager handles this prefix by first checking the logon session's DOS device links in the directory \Sessions\0\DosDevices\[LOGON_SESSION_ID] and then checking the system-wide DOS device links in the \Global?? directory.
Note that the former is a logon session, not a Windows session. The logon session directories are all under the DosDevices directory of Windows session 0 (i.e. the services session in Vista+). Thus if you have a mapped drive for a non-elevated logon, you'll discover that it's not available in an elevated command prompt, because your elevated token is actually for a different logon session.
An example of a DOS device link is \Global??\C: => \Device\HarddiskVolume2. In this case the DOS C: drive is actually a symbolic link to the HarddiskVolume2 device.
Here's a brief overview of how the system handles parsing a path to open a file. Given we're calling WinAPI CreateFile, it stores the translated NT UNICODE_STRING in an OBJECT_ATTRIBUTES structure and calls the system function NtCreateFile.
0:000> g
Breakpoint 1 hit
ntdll!NtCreateFile:
00007ff9`d2023d70 4c8bd1 mov r10,rcx
0:000> !obja #r8
Obja +000000b450f9ec58 at 000000b450f9ec58:
Name is \??\C:\Temp\test.txt
OBJ_CASE_INSENSITIVE
NtCreateFile calls the I/O manager function IoCreateFile, which in turn calls the undocumented object manager API ObOpenObjectByName. This does the work of parsing the path. The object manager starts with \??\C:\Temp\test.txt. Then it replaces that with \Global??\C:\Temp\test.txt. Next it parses up to the C: symbolic link and has to start over (reparse) with the final path \Device\HarddiskVolume2\Temp\test.txt.
Once the object manager gets to the HarddiskVolume2 device object, parsing is handed off to the I/O manager, which implements the Device object type. The ParseProcedure of an I/O Device creates the File object and an I/O Request Packet (IRP) with the major function code IRP_MJ_CREATE (an open/create operation) to be processed by the device stack. This is sent to the device driver via IoCallDriver. If the device implements reparse points (e.g. junction mountpoints, symbolic links, etc) and the path contains a reparse point, then the resolved path has to be resubmitted to the object manager to be parsed from the start.
The device driver will use the SeChangeNotifyPrivilege (almost always present and enabled) of the process token (or thread if impersonating) to bypass access checks while traversing directories. However, ultimately access to the device and target file has to be allowed by a security descriptor, which is verified via SeAccessCheck. Except simple filesystems such as FAT32 don't support file security.
Below is a Python 3 version of @Eryk Sun's solution:

import os

def winapi_path(dos_path, encoding=None):
    if not isinstance(dos_path, str) and encoding is not None:
        dos_path = dos_path.decode(encoding)
    path = os.path.abspath(dos_path)
    if path.startswith(u"\\\\"):
        return u"\\\\?\\UNC\\" + path[2:]
    return u"\\\\?\\" + path

(Python 3 renamed the unicode type to str, and the old str type has been replaced by bytes, so running the Python 2 version unchanged fails with NameError: name 'unicode' is not defined.)
Adding the solution that helped me fix a similar issue:
Python version = 3.9, Windows version = 10 Pro.
I had an issue with the filename itself, as it was too long for Python's built-in open. The error I got was that the path simply doesn't exist, even though I used the 'w+' mode for open (which is supposed to open a new file whether or not it already exists).
I found this guide, which solved the problem with a quick change in the Windows Registry Editor (specifically the Group Policy). Scroll down to the 'Make Windows 10 Accept Long File Paths' headline.
Don't forget to update your OS group policy so the change takes effect immediately; a guide can be found here.
Hope this helps future searches, as this post is quite old.
There can be multiple reasons for you getting this error. Please make sure of the following:
The parent directory of the folder (JSONFiles) is the same as the directory of the Python script.
Even though the folder exists it does not mean the individual file does. Verify the same and make sure the exact file name matches the one that your Python code is trying to access.
If you still face an issue, share the result of "dir" command on the innermost folder you are trying to access.
It works for me:

import os

str1 = r"C:\Users\sandeepmkwana\Desktop\folder_structure\models\manual\demodfadsfljdskfjslkdsjfklaj\inner-2djfklsdfjsdklfj\inner3fadsfksdfjdklsfjksdgjl\inner4dfhasdjfhsdjfskfklsjdkjfleioreirueewdsfksdmv\anotherInnerfolder4aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\5qbbbbbbbbbbbccccccccccccccccccccccccsssssssssssssssss\tmp.txt"
print(len(str1))  # 346

path = os.path.abspath(str1)
if path.startswith(u"\\\\"):
    path = u"\\\\?\\UNC\\" + path[2:]
else:
    path = u"\\\\?\\" + path

with open(path, "r+") as f:
    print(f.readline())
If you hit the long-path issue on Windows (paths longer than about 260 characters, the MAX_PATH limit), then try this.
Even though every Python gRPC quickstart references using grpc_tools.protoc to generate Python classes that implement a proto file, the closest thing to documentation that I can find simply says:
Given protobuf include directories $INCLUDE, an output directory $OUTPUT, and proto files $PROTO_FILES, invoke as:
$ python -m grpc.tools.protoc -I$INCLUDE --python_out=$OUTPUT --grpc_python_out=$OUTPUT $PROTO_FILES
That is not super helpful. I notice there are many limitations. For example, using an $OUTPUT of .. just fails silently.
Where can I find documentation on this tool?
I thought you were asking about the Python plugin. Did you try -h?
$ python -m grpc.tools.protoc -h
Usage: /usr/local/google/home/lidiz/.local/lib/python2.7/site-packages/grpc_tools/protoc.py [OPTION] PROTO_FILES
Parse PROTO_FILES and generate output based on the options given:
-IPATH, --proto_path=PATH Specify the directory in which to search for
imports. May be specified multiple times;
directories will be searched in order. If not
given, the current working directory is used.
--version Show version info and exit.
-h, --help Show this text and exit.
--encode=MESSAGE_TYPE Read a text-format message of the given type
from standard input and write it in binary
to standard output. The message type must
be defined in PROTO_FILES or their imports.
--decode=MESSAGE_TYPE Read a binary message of the given type from
standard input and write it in text format
to standard output. The message type must
be defined in PROTO_FILES or their imports.
--decode_raw Read an arbitrary protocol message from
standard input and write the raw tag/value
pairs in text format to standard output. No
PROTO_FILES should be given when using this
flag.
--descriptor_set_in=FILES Specifies a delimited list of FILES
each containing a FileDescriptorSet (a
protocol buffer defined in descriptor.proto).
The FileDescriptor for each of the PROTO_FILES
provided will be loaded from these
FileDescriptorSets. If a FileDescriptor
appears multiple times, the first occurrence
will be used.
-oFILE, Writes a FileDescriptorSet (a protocol buffer,
--descriptor_set_out=FILE defined in descriptor.proto) containing all of
the input files to FILE.
--include_imports When using --descriptor_set_out, also include
all dependencies of the input files in the
set, so that the set is self-contained.
--include_source_info When using --descriptor_set_out, do not strip
SourceCodeInfo from the FileDescriptorProto.
This results in vastly larger descriptors that
include information about the original
location of each decl in the source file as
well as surrounding comments.
--dependency_out=FILE Write a dependency output file in the format
expected by make. This writes the transitive
set of input file paths to FILE
--error_format=FORMAT Set the format in which to print errors.
FORMAT may be 'gcc' (the default) or 'msvs'
(Microsoft Visual Studio format).
--print_free_field_numbers Print the free field numbers of the messages
defined in the given proto files. Groups share
the same field number space with the parent
message. Extension ranges are counted as
occupied fields numbers.
--plugin=EXECUTABLE Specifies a plugin executable to use.
Normally, protoc searches the PATH for
plugins, but you may specify additional
executables not in the path using this flag.
Additionally, EXECUTABLE may be of the form
NAME=PATH, in which case the given plugin name
is mapped to the given executable even if
the executable's own name differs.
--grpc_python_out=OUT_DIR Generate Python source file.
--python_out=OUT_DIR Generate Python source file.
#<filename> Read options and filenames from file. If a
relative file path is specified, the file
will be searched in the working directory.
The --proto_path option will not affect how
this argument file is searched. Content of
the file will be expanded in the position of
#<filename> as in the argument list. Note
that shell expansion is not applied to the
content of the file (i.e., you cannot use
quotes, wildcards, escapes, commands, etc.).
Each line corresponds to a single argument,
even if it contains spaces.
For those seeking a simple and fast example to follow, this is what worked for me:
python3 -m grpc_tools.protoc --proto_path=/home/estathop/tf --python_out=/home/estathop/tf --grpc_python_out=/home/estathop/tweetf0rm ASD.proto
I used Python 3 and gave --proto_path the absolute path of the folder containing the .proto file. --python_out is likewise an absolute path (here the same folder) where the outcome of this one-line execution, ASD_pb2.py, is saved. The grpc_python_out option also wants an absolute path for where to save ASD_pb2_grpc.py. Finally, ASD.proto is the .proto file I want to compile, found in the current working directory of the command-line window.
This question already has answers here:
How to check type of files without extensions? [duplicate]
(10 answers)
Closed 6 years ago.
Most of the time, when we create a new text file with gedit in Linux, the file is not saved with a .txt extension. So how will I recognize it in my Django code, given that I can't check the file extension? Here is my code.
Let's say I have a resume field for each user in the following models.py:
class User(AbstractUser):
    resume = models.FileField(upload_to=get_attachment_file_path,
                              default=None, null=True,
                              validators=[validate_file_extension])
Now I want to validate the file for allowed extensions, so I made a validators.py as below:
def validate_file_extension(fieldfile_obj):
    megabyte_limit = 5.0
    filesize = sys.getsizeof(fieldfile_obj)
    ext = os.path.splitext(fieldfile_obj.name)[1]
    print("extension", ext)
    valid_extensions = ['.pdf', '.doc', '.docx', '.jpg', '.png',
                        '.xlsx', '.xls', '.txt', '.odt']
    if not ext.lower() in valid_extensions:
        raise ValidationError(u'Unsupported file extension.')
    elif filesize > megabyte_limit * 1024 * 1024:
        raise ValidationError("Max file size is %s Byte" % str(megabyte_limit))
Now whenever I upload a text file through my API, it says unsupported file type, because the code is unable to get the extension of the Linux text file. So how can I recognize a text file that is saved as just demo rather than demo.txt, but is a text file as shown in the file's properties?
Also, my next question is how to get the size of each file uploaded to that FileField. I am using PostgreSQL as the DBMS.
You probably want to detect the upload's MIME type regardless of file extension, and that's often done by reading the file header to detect "magic numbers" or other bit patterns indicating the true nature of a file. Text files are often the edge case, where no header is detected and the first N bytes are printable ASCII or Unicode.
While that's a bit of a rabbit hole to dive into, there are a few Python libraries that will do it for you. For example, https://github.com/ahupp/python-magic will work for your needs by simply inferring the MIME type from the file contents, which you can then match against the types you want to accept.
A somewhat related set of example code specific to your needs can be found here: https://stackoverflow.com/a/28306825/7341881
Edit: Eddie's solution is functionally equivalent; python-magic wraps libmagic, which is what Linux's native "file" command taps into. If you do decide to go the subprocess route, be extra careful that you're not creating a security vulnerability by improperly sanitizing user input (e.g. the user-provided filename). That could grant an attacker arbitrary access to your server's runtime environment.
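To make the magic-number idea concrete, here is a dependency-free sketch: it matches the first bytes of an upload against a deliberately tiny signature table (python-magic/libmagic know hundreds more), with a printable-text fallback for the header-less text-file case described above:

```python
# Minimal magic-number sniffing: a sketch, not a substitute for libmagic.
SIGNATURES = [
    (b'%PDF-', 'application/pdf'),
    (b'\x89PNG\r\n\x1a\n', 'image/png'),
    (b'PK\x03\x04', 'application/zip'),  # also .docx/.xlsx containers
]

def sniff_mime(data):
    """Guess a MIME type from the leading bytes of a file."""
    for signature, mime in SIGNATURES:
        if data.startswith(signature):
            return mime
    # No known header: printable/decodable bytes suggest plain text,
    # matching the "edge case" noted above.
    try:
        data.decode('utf-8')
        return 'text/plain'
    except UnicodeDecodeError:
        return 'application/octet-stream'
```

In a Django validator you would read the first couple of kilobytes of the uploaded file, call sniff_mime on them, and then seek(0) so later code can re-read the upload.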
Easy three-line solution with no external Python dependencies:

import subprocess

file_info = subprocess.getoutput('file demo')
print(file_info)

On POSIX systems (Linux, Unix, macOS, BSD, etc.) you can use the file command; for example, file demo will display the file's type even if the file extension is not explicitly set. Here demo is the argument to the file command, in other words the actual file you are trying to detect.
Disclaimer: be extra careful running external commands.
Please follow this link for more info about the Python subprocess module:
https://docs.python.org/3.6/library/subprocess.html
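If you go the subprocess route, passing the arguments as a list (rather than building a shell command string) sidesteps the sanitization problem the other answer warns about. A sketch, assuming the file command's --brief and --mime-type options are available on your system:

```python
import subprocess

def file_type(path):
    """Ask the system `file` command for a MIME type.

    Passing the path as a list element means the shell never parses it,
    which avoids injection via a user-supplied filename.
    """
    result = subprocess.run(
        ['file', '--brief', '--mime-type', path],
        capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

For an extension-less text file such as demo, file_type('demo') would return something like text/plain.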
I'm receiving an IOError when trying to create a file using open() in python, which only seems to occur for a single filename. The directories definitely exist and permissions are granted, the loop created around 1000 files successfully. When epic = "CON" in the code below I receive the "No such file or directory" error, but it works fine for other values.
f = open('data\\LSE\\%s.csv' % epic.strip(),'w')
f.write(u.read())
f.close()
Could this be a race condition? The files are created quite quickly.
I'm new to Python, so apologies if there's something obvious I missed!
The problem is that you are running this code on Windows, which still contains some legacies from MS-DOS 1.0. CON is a special name for the console device. You can't use it as a file name. The earliest versions of MS-DOS did not support directories, nor did they support the so-called "extension" of the 8.3 file naming pattern. As a result, the name is special regardless of the directory and regardless of extension.
Some references:
http://blogs.msdn.com/b/oldnewthing/archive/2003/10/22/55388.aspx
https://superuser.com/questions/86999/unable-to-rename-a-folder-or-a-file-as-con
http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx
Do not use the following reserved names for the name of a file:
CON, PRN, AUX, NUL, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, and LPT9. Also avoid these names followed immediately by an extension; for example, NUL.txt is not recommended.
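A small helper to screen names against that list before calling open; the set is taken from the documentation quoted above, and the check ignores the extension since NUL.txt is just as problematic as NUL:

```python
import os

# Windows device names that are reserved regardless of directory
# or extension (from the documentation quoted above).
_RESERVED = {'CON', 'PRN', 'AUX', 'NUL',
             *('COM%d' % i for i in range(1, 10)),
             *('LPT%d' % i for i in range(1, 10))}

def is_reserved_name(path):
    """True if the final path component's stem is a reserved device name."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return stem.upper() in _RESERVED
```

In the loop from the question, skipping (or renaming) any epic for which is_reserved_name(epic) is true would avoid the IOError on CON.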
The file extension is typically everything after the last period. If a filename has no ".", it has no extension. But what happens when the filename begins with a dot, as hidden files in Linux do?
In Python, such a file has no extension:
>>> os.path.splitext("base.ext")
('base', '.ext')
>>> os.path.splitext(".ext")
('.ext', '')
The common method in bash produces the other result, where there is only an extension and no base part (Extract filename and extension in Bash):

$ filename=".ext"
$ extension="${filename##*.}"
$ base="${filename%.*}"
$ echo $base

$ echo $extension
ext
How should code handle filenames such as this? Is there a standard? Does it differ per operating system? Or simply, which is most common/consistent?
[EDIT]
Let's say you have a file whose name is just ".pdf". Should, for example, an open dialog default to listing it when it is 1. not showing hidden files and 2. allowing all file extensions?
It's a hidden file: it begins with a period.
Is it actually a .pdf (by filename convention; sure, it holds PDF data), or is it a file with no extension?
File extensions in POSIX-based operating systems have no innate meaning; they're just a convention. Changing the extension wouldn't change anything about the file itself, just the name used to refer to it.
A file could have multiple extensions:
source.tar.gz
Sometimes a single extension represents a contraction of two:
source.tgz
Other files may not have an extension at all:
.bashrc
README
ABOUT
TODO
Typically, the only thing that defines an extension is that it is a trailing component of a filename that follows a non-initial period. Meaning is assigned by the application examining the file name. A PDF reader may focus on files whose names end with .pdf, but it should not refuse to open a valid PDF file whose name does not.
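pathlib follows the same convention as os.path.splitext, which gives a quick way to inspect how Python treats such names:

```python
from pathlib import Path

# A leading dot marks a hidden file, not an extension:
print(Path('.bashrc').suffix)          # empty string
print(Path('.bashrc').name)            # '.bashrc'

# Each later dot starts a suffix, and all of them are available:
print(Path('source.tar.gz').suffix)    # '.gz'
print(Path('source.tar.gz').suffixes)  # ['.tar', '.gz']

# Extension-less names simply have no suffix:
print(Path('README').suffix)           # empty string
```

So by Python's convention, a file named ".pdf" has no extension at all: the dot belongs to the (hidden-file) name, not to a suffix.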
Note that
extension="${filename##*.}"
is simply an application of a parameter-expansion operator that strips the longest prefix matching *., so it only yields a true (final) extension when the filename does not start with a period; for a dotfile like .bashrc it returns the whole name minus the leading dot. It's not an extension operator, it is a prefix-removal operator.