I'm getting an error message that I'm unable to tackle. I don't understand what the issue with the multiprocessing library is, and I don't see why it claims it is impossible to import the build_database module while at the same time it executes a function from that module perfectly.
Could somebody tell me if they see something? Thank you.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Python27\lib\multiprocessing\forking.py", line 380, in main
    prepare(preparation_data)
  File "C:\Python27\lib\multiprocessing\forking.py", line 495, in prepare
    '__parents_main__', file, path_name, etc
  File "C:\Users\Comp3\Desktop\User\Data\main.py", line 4, in <module>
    import database.build_database
ImportError: No module named build_database
This is what I have in my load_bigquery.py file:
# Send CSV to Cloud Storage
def load_send_csv(table):
    job = multiprocessing.current_process().name
    print '[' + table + '] : job starting (' + job + ')'
    bigquery.send_csv(table)

#timer.print_timing
def send_csv(tables):
    jobs = []
    build_csv(tables)
    for t in tables:
        if t not in csv_targets:
            continue
        print ">>>> Starting " + t
        # Load CSV in BigQuery, as parallel jobs
        j = multiprocessing.Process(target=load_send_csv, args=(t,))
        jobs.append(j)
        j.start()
    # Wait for jobs to complete
    for j in jobs:
        j.join()
And I call it like this from my main.py:
bigquery.load_bigquery.send_csv(tables)
My folder structure is like this:
src
| main.py
|
├───bigquery
│ │ bigquery.py
│ │ bigquery2.dat
│ │ client_secrets.json
│ │ herokudb.py
│ │ herokudb.pyc
│ │ distimo.py
│ │ flurry.py
│ │ load_bigquery.py
│ │ load_bigquery.pyc
│ │ timer.py
│ │ __init__.py
│ │ __init__.pyc
│ │
│ │
├───database
│ │ build_database.py
│ │ build_database.pyc
│ │ build_database2.py
│ │ postgresql.py
│ │ timer.py
│ │ __init__.py
│ │ __init__.pyc
That function works perfectly if I execute load_bigquery.py alone, but if I import it into main.py it fails with the errors given above.
UPDATE :
Here are my imports; maybe they help:
main.py
import database.build_database
import bigquery.load_bigquery
import views.build_analytics
import argparse
import getopt
import sys
import os
load_bigquery.py
import sys
import os
import subprocess
import time
import timer
import distimo
import flurry
import herokudb
import bigquery
import multiprocessing
import httplib2
bigquery.py
import sys
import os
import subprocess
import json
import time
import timer
import httplib2
from pprint import pprint
from apiclient.discovery import build
from oauth2client.file import Storage
from oauth2client.client import AccessTokenRefreshError
from oauth2client.client import OAuth2WebServerFlow
from oauth2client.client import flow_from_clientsecrets
from oauth2client.tools import run
from apiclient.errors import HttpError
Maybe the issue is that load_bigquery.py imports multiprocessing and then main.py imports load_bigquery.py?
You are probably missing the __init__.py inside src/bigquery/. So your source folders should be:
> src/main.py
> src/bigquery/__init__.py
> src/bigquery/load_bigquery.py
> src/bigquery/bigquery.py
The __init__.py just needs to be empty and is only there so that Python knows that bigquery is a Python package.
UPDATED: Apparently the __init__.py file is present. The actual error message is about something different: it cannot import database.build_database.
My suggestion is to look into that. It is not mentioned as being in the src folder...
UPDATE 2: I think you have a clash with your imports. Python 2 has slightly fuzzy relative imports, which sometimes catch people out. You have both a package at the same level as main.py called database and one inside bigquery called database. I think somehow you are ending up with the one inside bigquery, which doesn't have build_database. Try renaming one of them.
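As a side note not in the original answer: on Windows, multiprocessing starts each worker by re-importing the parent's main module, which is exactly why the traceback above passes through forking.py's prepare() and re-runs main.py's imports. A minimal sketch (invented table names, and a stand-in worker in place of load_send_csv) of keeping the entry point behind a __main__ guard so module-level code does not re-run in the children:

```python
import multiprocessing

def load_table(table, results):
    # hypothetical stand-in for load_send_csv(table); real work goes here
    results.put(table.upper())

def run_jobs(tables):
    results = multiprocessing.Queue()
    jobs = [multiprocessing.Process(target=load_table, args=(t, results))
            for t in tables]
    for j in jobs:
        j.start()
    for j in jobs:
        j.join()
    return sorted(results.get() for _ in jobs)

if __name__ == '__main__':
    # Without this guard, Windows' spawn-style multiprocessing re-imports
    # the main module in every child process and re-runs all module-level
    # code, including top-level imports that may fail there.
    print(run_jobs(['users', 'events']))
```

This does not change the diagnosis above about the clashing database packages; it only keeps the re-import harmless.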
Related
I was hoping someone could help me figure out an odd "dependency" problem. I have a fairly large python project, with a slimmed down structure that looks like:
Sitka
│ DataTickers.py
│ example.csv
│ FinDates.py
│ SitkaMongo.py
│ tickers_csv.csv
│ __init__.py
│
├───Fin
│ │ main.py
│ │ md_provider_control.py
│ │ Tofino.py
│ │ __init__.py
│ │
│ │
│ ├───Instruments
│ │ │ market_standard_instruments.py
│ │ └ __init__.py
│ │
│ ├───Env
│ │ │ CurveClass.py
│ │
│ ├───Utils
│ │ charting.py
│ │ exchange_identifier_mapper.py
│ │ fin_mapper.py
│ │ md_provider_simulation.py
│ └ __init__.py
Tofino.py:
from .Env.CurveClass import CurveData as _CurveData

class Tofino():
    def __init__(self, mdp, VAL_ENV=None):
        mdp.tofino = self  # link Tofino
        # Public VE Reference
        self.val_env = VAL_ENV
        self.ir_config = VAL_ENV.market
market_standard_instruments.py:
# Standard Imports
import Sitka.FinDates as fdate
import datetime as dt
import re
from itertools import product
# bunch of functions after this.
CurveClass.py:
import pandas as pd
import datetime as dt
from dateutil.relativedelta import relativedelta
class CurveData():
    def __init__(self):
        self.do_stuff = self._stuff()
main.py
from Sitka.FinDates import getMainDates
# Sitka- Custom Imports
from .md_provider_control import MD_ProviderV3
from .Tofino import Tofino
import Sitka.Fin.Instruments.market_standard_instruments as mkt_std
def main() -> Tofino:
    # < ---- do a bunch of stuff ---- >
    return Tofino(mdp=mdp, VAL_ENV=ve.GLOBAL_VALN_ENV)
And lastly, Sitka/Fin/__init__.py:
import logging
import traceback
# Run Valuation Environment Startup
from .main import main
# Global Variables:
from .Tofino import Tofino as _Tofino
tofino: _Tofino
tofino = None

try:
    tofino = main()  # I was trying some stuff out here, hence the weird traceback in try
except:
    print(traceback.format_exc())
My issue is that when I run import Sitka.Fin as fin, this line in main.py
import Sitka.Fin.Instruments.market_standard_instruments as mkt_std
fires off the Sitka.Fin.__init__ process again before we even get to the try block (so the init basically runs twice).
Any help is appreciated!
P.S. Basically I'm just including subfolder __init__'s because it's the only way I know to get IntelliSense/autocomplete in the IDE to work nicely... I would love to know how to make my code 'cleaner' in that sense.
Edit:
A simpler way to look at the problem: let's say I open a new IPython console and only do:
import Sitka.Fin.Instruments.market_standard_instruments as mkt_std
Simply doing this kicks off the entire Sitka.Fin.__init__ procedure [which I wouldn't have expected].
It seems you only want some of the code in main.py to run when the file itself is being executed. Try using:
if __name__ == "__main__":
    # All Sitka imports
    from Sitka.FinDates import getMainDates
    from .md_provider_control import MD_ProviderV3
    from .Tofino import Tofino
    import Sitka.Fin.Instruments.market_standard_instruments as mkt_std
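For what it's worth, the behavior observed here is documented Python semantics rather than anything specific to this project: importing pkg.sub.mod always imports the parent packages first, executing each __init__.py along the way. A self-contained sketch (the package names are invented) that builds a throwaway package on disk and demonstrates it:

```python
import os
import sys
import tempfile
import importlib

# Build a throwaway package: pkg_demo/__init__.py sets a flag when executed.
root = tempfile.mkdtemp()
sub = os.path.join(root, 'pkg_demo', 'sub')
os.makedirs(sub)
with open(os.path.join(root, 'pkg_demo', '__init__.py'), 'w') as f:
    f.write("INIT_RAN = True\n")
with open(os.path.join(sub, '__init__.py'), 'w') as f:
    f.write("")
with open(os.path.join(sub, 'mod.py'), 'w') as f:
    f.write("VALUE = 42\n")

sys.path.insert(0, root)
mod = importlib.import_module('pkg_demo.sub.mod')

# Importing the leaf module pulled in every parent package on the way down:
print('pkg_demo' in sys.modules)         # True
print(sys.modules['pkg_demo'].INIT_RAN)  # True: __init__.py executed
print(mod.VALUE)                         # 42
```

So the real fix for the double-run feeling is to keep Sitka.Fin's __init__.py free of heavy side effects (like calling main()), not to prevent it from running.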
So I'm working on a rather big Python 3 project where I'm writing unit tests for some of the files using the unittest library. In one unit test, the tested file imports a function from another python package whose __init__ file itself imports from the tested file. This leads to an ImportError during the unit test.
The __init__.py is intended to import from the tested file periphery\foo.py, so I would like to know whether it is possible to make the unit test work without removing the import from __init__.py.
Since the project contains a lot of files, I created a minimal example that illustrates the structure and in which the error can be reproduced. The project structure looks like this
├───core
│ bar.py
│ __init__.py
│
├───periphery
│ foo.py
│ __init__.py
│
└───tests
│ __init__.py
│
├───core
│ __init__.py
│
└───periphery
test_foo.py
__init__.py
The init file in core (core/__init__.py) contains the code
# --- periphery ---
from periphery.foo import Foo
while periphery/foo.py, which is the file to be tested, looks like
from core.bar import Bar

def Foo():
    bar = Bar()
    return bar
Finally, the unit test has the following structure
from unittest import TestCase
from periphery.foo import Foo
class Test(TestCase):
    def test_foo(self):
        """ Test that Foo() returns "bar" """
        self.assertEqual(Foo(), "bar")
Running the unit test yields the following error:
ImportError: cannot import name 'Foo' from partially initialized module 'periphery.foo' (most likely due to a circular import)
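No answer was recorded here, but one common way out of this kind of cycle (a sketch, not from this thread) is to defer the import in core/__init__.py into the function that needs it, so the cycle is only resolved at call time instead of at package-import time. Reproducing the minimal example above as a throwaway package (with _demo suffixes to keep the names hypothetical):

```python
import os
import sys
import tempfile
import importlib

root = tempfile.mkdtemp()
core = os.path.join(root, 'core_demo')
periphery = os.path.join(root, 'periphery_demo')
os.makedirs(core)
os.makedirs(periphery)

# core_demo/__init__.py defers its periphery import into a function, so
# importing core_demo no longer triggers the cycle at import time.
with open(os.path.join(core, '__init__.py'), 'w') as f:
    f.write("def get_foo():\n"
            "    from periphery_demo.foo import Foo  # deferred\n"
            "    return Foo\n")
with open(os.path.join(core, 'bar.py'), 'w') as f:
    f.write("def Bar():\n"
            "    return 'bar'\n")
with open(os.path.join(periphery, '__init__.py'), 'w') as f:
    f.write("")
with open(os.path.join(periphery, 'foo.py'), 'w') as f:
    f.write("from core_demo.bar import Bar\n"
            "def Foo():\n"
            "    return Bar()\n")

sys.path.insert(0, root)
# This import used to be the one that failed; with the deferred import in
# core_demo/__init__.py there is no cycle left to trip over.
foo = importlib.import_module('periphery_demo.foo')
print(foo.Foo())  # bar
```

The trade-off is that the convenience re-export is no longer a plain module attribute, but the unit test can import periphery.foo directly without touching the cycle.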
I have run into a rather well-known issue while importing my Python modules inside a project.
This code is written to replicate the existing situation:
multiply_func.py:
def multiplier(num_1, num_2):
    return num_1 * num_2
power_func.py:
from math_tools import multiplier

def pow(num_1, num_2):
    result = num_1
    for _ in range(num_2 - 1):
        result = multiplier(num_1, result)
    return result
The project structure:
project/
│ main.py
│
└─── tools/
│ __init__.py
│ power_func.py
│
└─── math_tools/
│ __init__.py
│ multiply_func.py
I've added these lines to __init__ files to make the importing easier:
__init__.py (math_tools):
from .multiply_func import multiplier
__init__.py (tools):
from .power_func import pow
from .math_tools.multiply_func import multiplier
Here is my main file.
main.py:
from tools import pow
print(pow(2, 3))
Whenever I run it, there is this error:
>>> ModuleNotFoundError: No module named 'math_tools'
I tried manipulating the sys.path, but I had no luck eliminating this puzzling issue. I'd appreciate your kind help. Thank you in advance!
The mistake is in the power_func.py file: you have to use a . before math_tools to refer to a module inside the current package. Update power_func.py as below and it works:
from .math_tools import multiplier
My Project looks like this:
├─outer_module
│ │ __init__.py
│ │
│ └─inner_module
│ a.py
│ b.py
├─test.py
__init__.py:
from outer_module.inner_module import a
from outer_module.inner_module import b
a.py:
instance_a = 1
b.py:
instance_b = 1
print("instance_b created!")
test.py:
from outer_module.inner_module import a
I want to shorten the import path in test.py, i.e. use from outer_module import a. That is not unusual when I turn my project into a release module. But with this __init__.py, it will automatically execute b.py and print instance_b created!. Separating a.py and b.py out of inner_module is not recommended because they are functionally similar, and other .py files may use b.py, so b.py must appear in __init__.py.
Could anyone give some advice?
In your __init__.py file try importing the a and b files individually and adding the two imports to your __all__ variable.
from outer_module.inner_module import a
from outer_module.inner_module import b
__all__ = [
    'a',
    'b',
]
Now you can import a and b directly from outer_module.
from outer_module import a
from outer_module import b
I have this structure:
├── app
│ ├── __init__.py
│ └── views.py
├── requirements.txt
├── sources
│ └── passport
│ ├── field_mapping.
│ ├── listener.py
│ ├── main.py
This is my __init__ file:
from flask import Flask
app = Flask(__name__)
from app import views
And my views file (is this the best way to send plain text?):
from app import app
from flask import Response
from sources.app_metrics import meters
# from sources.passport.main import subscription_types

@app.route('/metrics')
def metrics():
    def generateMetrics():
        metrics = ""
        for subscription in ["something", "some other thing"]:
            metrics += "thing_{}_count {}\n".format(subscription, meters[subscription].get()['count'])
        return metrics
    print(generateMetrics())
    return Response(generateMetrics(), mimetype='text/plain')
My sources/passport/main file looks like this:
subscription_types = ["opportunity", "account", "lead"]

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    ...
    for subscription in subscription_types:
I also ran export FLASK_ENV=app/__init__.py before running the Flask app.
When I visit /metrics I get an error that looks like some kind of circular dependency error. The error occurs when I uncomment that import line in my views file: pulling subscription_types out into a variable and importing it seems to be causing the problem.
My stack trace:
  File "/usr/local/lib/python3.7/site-packages/flask/cli.py", line 235, in locate_app
    __import__(module_name)
  File "/Users/jwan/extract/app/__init__.py", line 5, in <module>
    from app import views
  File "/Users/jwan//extract/app/views.py", line 5, in <module>
    from sources.passport.main import subscription_types
  File "/Users/jwan/extract/sources/passport/main.py", line 3, in <module>
    from sources.passport.listener import subscribe, close_subscriptions
  File "/Users/jwan/extract/sources/passport/listener.py", line 18, in <module>
    QUEUE = boto3.resource("sqs").get_queue_by_name(QueueName=CONFIG["assertions_queue"][ENV])
botocore.errorfactory.QueueDoesNotExist: An error occurred (AWS.SimpleQueueService.NonExistentQueue) when calling the GetQueueUrl operation: The specified queue does not exist for this wsdl version.
My sources/passport/listener file has this on line 18:
import gzip
import log
from os import getenv
from sources.passport.normalizer import normalize_message
from sources.app_metrics import meters
QUEUE = boto3.resource("sqs").get_queue_by_name(QueueName=CONFIG["assertions_queue"][ENV])
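No answer was recorded here, but the stack trace already points at the cause: listener.py performs the SQS lookup at module level, so merely importing subscription_types from main.py imports listener.py and fires a live AWS call. One common fix (a hedged sketch, with a stand-in in place of the boto3 call) is to create the resource lazily on first use, so importing the module has no side effects:

```python
# Lazy initialization: importing this module does nothing expensive; the
# resource is only created the first time get_queue() is called.
_queue = None

def _connect():
    # stand-in for boto3.resource("sqs").get_queue_by_name(...), which
    # the original listener.py runs unconditionally at import time
    return {'name': 'assertions_queue'}

def get_queue():
    global _queue
    if _queue is None:
        _queue = _connect()
    return _queue

print(get_queue()['name'])  # assertions_queue
```

With this shape, from sources.passport.main import subscription_types no longer touches AWS at all; only code paths that actually call get_queue() do.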