SimpleFSDirectory not working in python lucene

SimpleFSDirectory not working in python lucene - python

import sys, os
import lucene
from java.io import File
from org.apache.lucene.analysis.standard import StandardAnalyzer
from org.apache.lucene.index import DirectoryReader
from org.apache.lucene.queryparser.classic import QueryParser
from org.apache.lucene.document import Document, Field
from org.apache.lucene.index import IndexWriter, IndexWriterConfig
from org.apache.lucene.store import SimpleFSDirectory
from org.apache.lucene.util import Version
def index(start, no, dom):
lucene.initVM()
# join base dir and index dir
path = raw_input("Path for index: ")
index_path = File(path)
directory = SimpleFSDirectory(index_path) # the index
I keep having errors with the SimpleFSDirectory, even when I tried other things like directory = SimpleFSDirectory(File(os.path.abspath("paths")))
InvalidArgsError: (, 'init', (,))

As of 5.0, the SimpleFSDirectory ctor no longer takes a File argument, it takes a Path. You can convert to a Path, with File.toPath().
Also, I would recommend you use FSDirectory.open, instead. FSDirectory.open allows lucene to try to choose the best Directory implementation for the current environment, which will generally perform better than a SimpleFSDirectory.
So, something like:
index_path = File(path).toPath()
directory = FSDirectory.open(index_path)

Related

Import custom modules in azure python function

I am building a python function I and I need to add my custom module in the code. My azure function looks like this:
import logging
import sys
from sys import path
import os
import time
sys.path.insert(0, './myFunctionFolderName') // Try to insert it as a standard python library.
import azure.functions as func
from myCustomModule.models import someFunctionName
def main(req: func.HttpRequest) -> func.HttpResponse:
name = "My new function"
testing_variable = someFunctionName()
return func.HttpResponse(f"Function executed {name}")]
What I have done is inserted my function folder as a standard path so python looks for the libraries in that folder as well. This function works perfectly in the local environment using Visual Studio Code. However, when I deploy and run it, it throws myCustomModule not found. Another piece of information is that I cannot access Kudu (terminal) since it is not suppored on my consumption plan for Linux. I am not sure what am I missing. Please not since its not a standard library I cannot add it in the requirements.txt. Also note, that my function folder has my function script file and my custom module so they are in the same folder as it should be in the azure python template.

Use an absolute path rather than the relative paths.
Following worked for me
import logging
import sys
from sys import path
import os
import time
dir_path = os.path.dirname(os.path.realpath(__file__))
sys.path.insert(0, dir_path)
import azure.functions as func
from myCustomModule.models import someFunctionName
def main(req: func.HttpRequest) -> func.HttpResponse:
name = "My new function"
testing_variable = someFunctionName()
return func.HttpResponse(f"Function executed {name}")]

A simpler solution than this response is to explicitly tell python to use local import adding a dot before the custom module import, like this:
import logging
import sys
from sys import path
import os
import time
import azure.functions as func
from .myCustomModule.models import someFunctionName
It works with Azure Functions

See also the offical documentation regarding Import Behavior in Functions for Python.

Import all files in current directory

I have just started a python project. The directory structure is as follows:
/algorithms
----/__init__.py
----/linkedlist
--------/__init__.py
--------/file1.py
--------/file2.py
/tests
----/test_linkedlist
You can also check the Github repository.
In each of the sub folders under algorithms, in the __init__ file I am including the following for all the files one by one:
from .file1 import *
from .file2 import *
And so on.
The task that I am trying to achieve is running all tests together using the query:
python3 -m unittest discover tests
Each file in the tests directory starts as follows:
from algorithms.linkedlist import *
import unittest
Right now if I want to add a new file to the linkedlist directory, I create the file and then add another from .filename import * in the __init__ file.
How do I write a script in the __init__ file so that each time I create a new file, I do not have to manually insert the import command?

So the __init__ is in the same folder? As the docs say The import statement is syntactic sugar for the __import__ function.
So we can use:
import importlib
import glob
for file in glob.iglob('*.py'):
importlib.__import__(file)
Some reasons why this does not work:
You want to load the functions in the module - the import * from syntax. With this code you can only run file1.test.
You run the script loading from another directory, which confuses glob. We have to specify the actual working directory.
__import__ prefers to know the module name.
To find the solution I combine the import * from function from this answer with pkgutil.walk_packages from this blog.
import importlib
import pkgutil
def custom_import_all(module_name):
""" Use to dynamically execute from module_name import * """
# get a handle on the module
mdl = importlib.import_module(module_name)
# is there an __all__? if so respect it
if "__all__" in mdl.__dict__:
names = mdl.__dict__["__all__"]
else:
# otherwise we import all names that don't begin with _
names = [x for x in mdl.__dict__ if not x.startswith("_")]
# now drag them in
globals().update({k: getattr(mdl, k) for k in names})
__path__ = pkgutil.extend_path(__path__, __name__)
for importer, modname, ispkg in pkgutil.walk_packages(path=__path__, prefix=__name__+'.'):
custom_import_all(modname)

distutils wildcard entry for include_dir

I am setting up a package using distutils.
I need to allow access to a module that is built during the set-up process and is located in ./build/temp.linux-x86_64-3.6. I do this by including the
include_dirs=["./build/temp.linux-x86_64-3.6"]
when adding the extension to the distutils Configuration.
My question is there a way of setting this using a wildcard such as:
include_dirs=["./build/temp.linux*"]
as when I try this it fails, citing error:
Nonexistent include directory ‘build/temp.linux*’ [-Wmissing-include-dirs]
The reason I want this is that the build folder will be named differently depending on the system. Alternatively if anyone knows a way of figuring out what this temp build folder will be called that would also work.

The way I have got around this problem is as follows:
def return_major_minor_python():
import sys
return str(sys.version_info[0])+"."+str(sys.version_info[1])
def return_include_dir():
from distutils.util import get_platform
return get_platform()+'-'+return_major_minor_python()
Then when calling config.add_extension() using:
include_dirs=['build/temp.' + return_include_dir()]
So the whole process for adding a f90wrapped, f2py extension to a python package is:
def setup_fort_ext(args,parent_package='',top_path=''):
from numpy.distutils.misc_util import Configuration
from os.path import join
import sys
config = Configuration('',parent_package,top_path)
fort_src = [join('PackageName/','fortran_source.f90')]
config.add_library('_fortran_source', sources=fort_src,
extra_f90_compile_args = [ args["compile_args"]],
extra_link_args=[args["link_args"]])
sources = [join('PackageName','f90wrap_fortran_source.f90')]
config.add_extension(name='_fortran_source',
sources=sources,
extra_f90_compile_args = [ args["compile_args"]],
extra_link_args=[args["link_args"]],
libraries=['_tort'],
include_dirs=['build/temp.' + return_include_dir()])
return config
if __name__ == '__main__':
import sys
import subprocess
import os
install_numpy() #installs numpy
install_dependencies() #calls to pip to install any requirements
from numpy.distutils.core import setup
config = {'name':'PackageName',
'version':__version__,
'project_description':'Some Package description',
'description':'Some package Description',
'long_description': open('README.txt').read(),
'long_description_content_type':'text/markdown',
'author':'Your name here',
'author_email':'your email here',
'url':'link to git repo here',
'python_requires':'>=3.3',
'packages':['PackageName'],
'package_dir':{'PackageName':'PackageName'},
'package_data':{'PackageName':['*so*']},
'name': 'PackageName'
}
config_fort = setup_fort_ext(args,parent_package='PackageName',top_path='')
config2 = dict(config,**config_fort.todict())
setup(**config2)
where the source fortran_source.f90 is wrapped beforehand and the resulting wrapped source file (f90wrap_fortran_source.f90) is included as a library, and subsequently compiled by f2py.
args in the above is just a dict with the any linking or compile args you wish to pass through.

How do I import all functions from a package in python?

I have a directory as such:
python_scripts/
test.py
simupy/
__init__.py
info.py
blk.py
'blk.py' and 'info.py are modules that contains several functions, one of which is the function 'blk_func(para)'.
Within '__init__.py' I have included the following code:
import os
dir_path = os.path.dirname(os.path.realpath(__file__))
file_lst = os.listdir(dir_path)
filename_lst = list(filter(lambda x: x[-3:]=='.py', file_lst))
filename_lst = list(map(lambda x: x[:-3], filename_lst))
filename_lst.remove('__init__')
__all__ = filename_lst.copy()
I would like to access the function 'blk_func(para)', as well as all other functions inside the package, within 'test.py'. Thus I import the package by putting the following line of code in 'test.py':
from simupy import*
However, inorder to use the function, I still have to do the following:
value = blk.blk_func(val_param)
How do I import the package simupy, such that I can directly access the function in 'test.py' by just calling the function name? i.e.
value = blk_func(val_para)

Pretty easy
__init__.py:
from simupy.blk import *
from simupy.info import *
Btw, just my two cents but it looks like you want to import your package's functions in __init__.py but perform actions in __main__.py.
Like
__init__.py:
from simupy.blk import *
from simupy.info import *
__main__.py:
from simupy import *
# your code
dir_path = ....
It's the most pythonic way to do. After that you will be able to:
Run your script as a proper Python module: python -m simupy
Use your module as library: import simupy; print(simupy.bar())
Import only a specific package / function: from simupy.info import bar.
For me it's part of the beauty of Python..

Module undefined after importing it in a function

I'm trying to use a function called start to set up my enviroment in python. The function imports os.
After I run the function and do the following
os.listdir(simdir+"main")
I get a error that says os not defined
code
>>> def setup ():
import os.path
import shutil
simdir="e:\\"
maindir="c:\\backup\\bitcois\\test exit\\"
>>> setup()
>>> os.listdir(simdir+"main")
Traceback (most recent call last):
File "<pyshell#10>", line 1, in <module>
os.listdir(simdir+"main")
NameError: name 'os' is not defined

The import statement is scoped. When importing modules they are defined for the local namespace.
From the documentation:
Import statements are executed in two steps: (1) find a module, and initialize it if necessary; (2) define a name or names in the local namespace (of the scope where the import statement occurs). [...]
So in your case the os package is only defined within function setup.

You are getting this error because you are NOT importing the whole os library but just the os.path module. In this way, the other resources at the os library are not made available for your use.
In order to be able to use the os.listdir method, you need to either import it alongside the os.path like this:
>>> def setup ():
import os.path, os.listdir
import shutil
simdir="e:\\"
maindir="c:\\backup\\bitcois\\test exit\\"
or import the full library:
>>> def setup ():
import os
import shutil
simdir="e:\\"
maindir="c:\\backup\\bitcois\\test exit\\"
You can read more here:
https://docs.python.org/2/tutorial/modules.html

try:
import os.path
import shutil
import glob
def setup ():
global simdir
simdir="e:\\"
maindir="c:\\backup\\bitcois\\test exit\\"
setup()
os.listdir(simdir+"main")

You need to return the paths and assign the returned values in the global scope. Also, import os too:
import os
def setup():
# retain existing code
return simdir, maindir
simdir, maindir = setup()

When you import os or do any sort of command within a function, the command's effect only last while that function itself is running. What you need to do is
import os
...Do your function and other code
This way, your import lasts for the whole program :).

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

SimpleFSDirectory not working in python lucene - python

Related

Import custom modules in azure python function

Import all files in current directory

distutils wildcard entry for include_dir

How do I import all functions from a package in python?

Module undefined after importing it in a function

Categories

Resources