Our application is one of the few left running on DEA. On DEA we were able to use a specific custom buildpack:
https://github.com/ihuston/python-conda-buildpack
Now that we have to move to the Diego runtime, we run out of space while pushing the app. I believe the extra disk space is only required during staging, because quite a few libraries come with the buildpack and have to be built (we need the whole scientific Python stack, which is all included in the above buildpack).
The build script output looks fine, except that the app cannot start. The logs then show:
2016-10-13T19:10:42.29+0200 [CELL/0] ERR Copying into the container failed: stream-in: nstar: error streaming in: exit status 2. Output: tar: ./app/.conda/pkgs/cache/db552c1e.json: Wrote only 8704 of 10240 bytes
and similar errors for many more files:
2016-10-13T19:10:42.29+0200 [CELL/0] ERR tar: ./app/.conda/pkgs/cache/9779607c273dc0786bd972b4cb308b58.png: Cannot write: No space left on device
and then
2016-10-13T20:16:48.30+0200 [API/0] OUT App instance exited with guid b2f4a1be-aeda-44fa-87bc-9871f432062d payload: {"instance"=>"", "index"=>0, "reason"=>"CRASHED", "exit_description"=>"Copying into the container failed", "crash_count"=>14, "crash_timestamp"=>1476382608296511944, "version"=>"ca10412e-717a-413b-875a-535f8c3f7be4"}
When I try to increase the disk quota above 1G, there is an error:
Server error, status code: 400, error code: 100001, message: The app is invalid: disk_quota too much disk requested (must be less than 1024)
Is there a way to give a bit more space? At least for the build process?
You can use a .cfignore file, just like a .gitignore file, to exclude unneeded files from being pushed with cf push. If you really only push what is necessary, the disk space might be sufficient.
https://docs.developer.swisscom.com/devguide/deploy-apps/prepare-to-deploy.html#exclude
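For illustration, a .cfignore uses the same pattern syntax as .gitignore; a minimal sketch (these entries are only placeholders, adjust them to your project) could be:
.git/
tests/
docs/
*.ipynb
raw_data/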
The conda installer from https://github.com/ihuston/python-conda-buildpack installs the Intel MKL library by default. This is usually a good thing, but it apparently uses too much space, so the app cannot be deployed.
I adapted the buildpack and added the nomkl flag to the line
$CONDA_BIN/conda install --yes --quiet --file "$BUILD_DIR/conda_requirements.txt"
so that it becomes
$CONDA_BIN/conda install nomkl --yes --quiet --file "$BUILD_DIR/conda_requirements.txt"
As described in Continuum's blog post here:
https://www.continuum.io/blog/developer-blog/anaconda-25-release-now-mkl-optimizations
Conda will then use OpenBLAS instead, which results in a much smaller droplet (175 MB instead of 330 MB), and the deployment can finish successfully.
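As a side note, and purely as an untested assumption on my part, listing nomkl at the top of conda_requirements.txt should have the same effect, since the file is passed to the same conda install call (the other entries below are just placeholders):
# conda_requirements.txt (hypothetical excerpt)
nomkl
numpy
scipy
pandas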
Related
I built an app using KivyMD and converted it to an APK using the following commands. After installing it on my phone, the app force closes within a few seconds of opening.
1.buildozer init
2.nano buildozer.spec
(in order to change some settings like the app's name)
3.Then I install the following dependencies for buildozer:
a)sudo apt update
b)sudo apt install -y git zip unzip openjdk-8-jdk python3-pip autoconf libtool pkg-config zlib1g-dev libncurses5-dev libncursesw5-dev libtinfo5 cmake libffi-dev libssl-dev
c)pip3 install --user --upgrade Cython==0.29.19 virtualenv
d)export PATH=$PATH:~/.local/bin/
After the above commands, I finally execute the following command:
4.buildozer -v android debug
I use KivyMD because it has good material design.
I've added a picture below; the app loads like this and then returns to my home screen.
Thanks in advance
One procedure you can follow to get accurate information about the issue: I assume your project has a "main.py" script, so make a copy of that script and rename it to, for example, "main2.txt", so it becomes plain text (but it is still your program). Then edit the "main.py" script and insert the following code (watch my video https://youtu.be/LFQVhOzRlE0); please read the comments in the code:
try:
    # Run the copied program so that any exception can be caught here.
    m = open("main2.txt").read()
    exec(m)
except Exception as e:
    # The path may be different from device to device; sometimes it is
    # /sdcard/. For the app to have access to the external storage on the
    # device, you should add the READ_EXTERNAL_STORAGE permission in the
    # spec file and then grant the permission manually from the app manager
    # on your phone. This lets the APK write the error to a file, which you
    # can then read to see why the app closes itself or what is throwing
    # the error. A fun fact: if "The_error_is_jbsidis.txt" ends up empty
    # (sometimes the app crashes before Python can execute your main.py),
    # the issue is in the compilation itself or a module is missing from
    # your spec file.
    n = open("/storage/emulated/0/The_error_is_jbsidis.txt", "w")
    n.write(str(e))
    n.close()
Remember to add the permission in the spec file; otherwise the app will not be granted access to the device storage and won't be able to write the error before it crashes. For more info, watch my video: https://youtu.be/LFQVhOzRlE0
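For reference (this exact line is my assumption, not something from the original answer), the permission setting in buildozer.spec looks something like the following; WRITE_EXTERNAL_STORAGE is most likely needed as well so the error file can actually be written:
android.permissions = READ_EXTERNAL_STORAGE,WRITE_EXTERNAL_STORAGE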
I am using elastic beanstalk to deploy my Django application. Today it suddenly stopped working without any breaking changes from the application side (I've changed some templates, nothing more).
The deployment times out after 10 minutes of trying to deploy the app, and nothing happens.
The only more or less useful hint I can see in the log is this:
[2020-02-20T15:00:20.437Z] INFO [19057] - [Application update .../postbuild_0_myproject/Command 01_migrate] : Activity execution failed, because: SystemCheckError: System check identified some issues:
ERRORS:
education.Author.photo: (fields.E210) Cannot use ImageField because Pillow is not installed.
HINT: Get Pillow at https://pypi.org/project/Pillow/ or run command "pip install Pillow".
education.Course.cover_image: (fields.E210) Cannot use ImageField because Pillow is not installed.
HINT: Get Pillow at https://pypi.org/project/Pillow/ or run command "pip install Pillow".
education.CourseCategory.icon_image: (fields.E210) Cannot use ImageField because Pillow is not installed.
HINT: Get Pillow at https://pypi.org/project/Pillow/ or run command "pip install Pillow".
Using staging settings
App receivers connected
(ElasticBeanstalk::ExternalInvocationError)
[2020-02-20T15:00:20.437Z] INFO [19057] - [Application update .../postbuild_0_myproject/Command 01_migrate] : Activity failed.
[2020-02-20T15:00:20.437Z] INFO [19057] - [Application update .../postbuild_0_myproject] : Activity failed.
[2020-02-20T15:00:20.437Z] INFO [19057] - [Application update ...] : Activity failed.
[2020-02-20T15:00:20.507Z] INFO [19057] - [Application update app-9a24-200220_145942-stage-200220_145942#142/AppDeployStage0/EbExtensionPostBuild] : Activity failed.
[2020-02-20T15:00:20.507Z] INFO [19057] - [Application update app-9a24-200220_145942-stage-200220_145942#142/AppDeployStage0] : Activity failed.
[2020-02-20T15:00:20.508Z] INFO [19057] - [Application update app-9a24-200220_145942-stage-200220_145942#142] : Completed activity. Result:
Application update - Command CMD-AppDeploy failed
But I already have Pillow in requirements.txt and the log above says:
Requirement already satisfied: Pillow==6.2.1 in /opt/python/run/venv/lib64/python3.6/site-packages (from -r /opt/python/ondeck/app/requirements.txt (line 51))
How can I troubleshoot and fix this? And how can I avoid similar issues in the future? I am really frightened that the same problem may randomly pop up in the production environment.
Here's some more info about the configuration:
Here's what I have in .ebextensions:
01_packages.config:
packages:
  yum:
    git: []
    postgresql93-devel: []
db-migrate.config:
container_commands:
  01_migrate:
    command: "django-admin.py migrate"
    leader_only: true
option_settings:
  aws:elasticbeanstalk:application:environment:
    DJANGO_SETTINGS_MODULE: myproject.settings
django.config:
option_settings:
  aws:elasticbeanstalk:container:python:
    WSGIPath: myproject/wsgi.py
wsgi_custom.config:
files:
  "/etc/httpd/conf.d/wsgihacks.conf":
    mode: "000644"
    owner: root
    group: root
    content: |
      WSGIPassAuthorization On
This one is a pain and a known issue with Django when using the ImageField model/form field. Due to Python's dynamic import system it can suddenly appear; it annoyed the hell out of me when I first came across it.
The way I normally fix this is by using conda and its equivalent of a virtualenv to ensure the right interpreter (the one with my packages) is used.
If you are not using a virtualenv or an equivalent, set one up now. If you are already using one, check that you are installing Pillow with pip3 install pillow; the pip3 is important here, as on Debian (and many other) systems plain pip will only install for Python 2.x.
Using conda will ensure this doesn't happen in production, but I would also add it to your checklist of things to test when deploying: check that the correct version of Pillow is set up and working.
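As a quick sanity check (my own addition, not part of the original answer), you can run something like this with the interpreter that actually serves the Django app to confirm it can import Pillow:
import sys

import PIL  # Pillow installs under the "PIL" namespace

print(sys.executable)   # which interpreter is actually running
print(PIL.__version__)  # which Pillow version it can see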
I had two Elastic Beanstalk environments with the same issue (one web tier env and a worker env).
On one of them the issue was resolved by restarting the environment.
The other one failed to restart and timed out every time on any operation. This one I managed to fix by going to Configuration > Capacity and changing the minimum and maximum number of instances to 0. I applied the changes, waited for them to take effect, and then restored the previous values for the minimum and maximum instance numbers.
That fixed the issue.
I still have no idea what caused the issue in the first place and would love to receive some comment on that.
EDIT: As Rekovni pointed out, using a GitLab runner with Docker on a Windows machine is a problem. Installing the runner in a Linux-based virtual machine solved the problem.
I am developing a Python program using a conda environment. It is hosted on GitLab.com and I am using GitLab-CI to generate the documentation.
I configured the following .gitlab-ci.yml file for it:
image: continuumio/miniconda3:latest
before_script:
  # Update conda and create environment, which is then activated.
  - conda update -vvv -y -c conda-forge conda
  - conda env create -f helpers/NAME.yml
  - source activate NAME
  # Correct installation.
  - conda install -q -y gsl=2.2.1
pages:
  script:
    # Install make.
    - apt-get update
    - apt-get install -q -y build-essential
    # Install Sphinx-related packages.
    - conda install -q -y sphinx sphinx_rtd_theme
    # Create documentation.
    - cd REPO/doc
    - sphinx-apidoc -o source/ ../REPO --force --separate
    - make html
    # Transfer documentation to public pages folder.
    - mv build/html/ ../../public/
  artifacts:
    paths:
      - public
  # only:
  #   - master
Running this script with a shared GitLab runner that is supplied with GitLab.com works and the documentation is generated and placed in the public folder.
For future unit tests (which take much longer), I want to provide a local runner on a Win 10 machine in my network. For this, I installed the gitlab-runner.exe and Docker Desktop. I successfully registered the runner with the project on GitLab.com.
The runner is using the following config.toml configuration file:
concurrent = 1
check_interval = 0
log_level = "info"
[session_server]
  session_timeout = 1800
[[runners]]
  name = "NAME"
  url = "https://gitlab.com"
  token = "TOKEN"
  executor = "docker"
  [runners.custom_build_dir]
  [runners.docker]
    tls_verify = false
    image = "alpine:latest"
    privileged = false
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/cache"]
    shm_size = 0
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
The problem now is that the local runner freezes during the execution of the above script without producing any error messages, and I am at a loss as to how to debug it. What I have is:
1. the log of the script that is shown on the job page on GitLab.com; and
2. the console output of the gitlab-runner.exe on the local machine.
Regarding 1., I see
Running with gitlab-runner 11.10.0 (3001a600)
...
Checking out COMMIT_HASH as BRANCH_NAME...
...
$ conda update -vvv -y -c conda-forge conda
DEBUG conda.gateways.logging:set_verbosity(148): verbosity set to 3
...
...
...
TRACE conda.gateways.disk.update:rename(52): renaming /opt/conda/share/doc/openssl/html/man3/OSSL_STORE_LOADER_new.html => /opt/conda/share/doc/openssl/html/man3/OSSL_STORE_LOADER_new.html.c~
TRACE conda.core.path_actions:execute(1041): renaming share/doc/openssl/html/man3/OSSL_STORE_LOADER_set_close.html => share/doc/openssl/html/man3/OSSL_STORE_LOADER_set_close.html.c~
TRACE conda.gateways.disk.update:rename(52): renaming /opt/conda/share/doc/openssl/html/man3/OSSL_STORE_LOADER_set_close.html => /opt/conda/share/doc/openssl/html/man3/OSSL_STORE_LOADER_set_close.html.c~
TRACE conda.core.path_actions:execute(1041): renaming share/doc/openssl/html/man3/OSSL_STORE_LOADER_set_ctrl.html => share/doc/openssl/html/man3/OSSL_STORE_LOADER_set_ctrl.html.c~
where it abruptly stops without reaching the - conda env create -f helpers/NAME.yml line.
Regarding 2., I see
C:\GitLab-Runner>gitlab-runner.exe --debug run
Runtime platform arch=amd64 os=windows pid=14116 revision=3001a600 version=11.10.0
Starting multi-runner from C:\GitLab-Runner\config.toml ... builds=0
Checking runtime mode GOOS=windows uid=-1
Configuration loaded builds=0
...
Feeding runners to channel builds=0
Checking for jobs... nothing runner=TOKEN
Feeding runners to channel builds=0
Checking for jobs... received job=203033130 repo_url=REPO_URL.git runner=TOKEN
...
Attaching to container HASH ... job=203033130 project=6249897 runner=TOKEN
Starting container HASH ... job=203033130 project=6249897 runner=TOKEN
Waiting for attach to finish HASH ... job=203033130 project=6249897 runner=TOKEN
Waiting for container HASH ... job=203033130 project=6249897 runner=TOKEN
Appending trace to coordinator... ok code=202 job=203033130 job-log=0-10348 job-status=running runner=TOKEN sent-log=1801-10347 status=202 Accepted
Appending trace to coordinator... ok code=202 job=203033130 job-log=0-19445 job-status=running runner=TOKEN sent-log=10348-19444 status=202 Accepted
...
Appending trace to coordinator... ok code=202 job=203033130 job-log=0-933150 job-status=running runner=TOKEN sent-log=241860-933149 status=202 Accepted
Submitting job to coordinator... ok code=200 job=203033130 job-status= runner=TOKEN
Submitting job to coordinator... ok code=200 job=203033130 job-status= runner=TOKEN
where it seems that the switch from "Appending trace to coordinator" to "Submitting job to coordinator" happens around the time it gets stuck.
After this, 1. is not updated with any further information, and 2. is stuck in a "Submitting job to coordinator" loop.
Does anyone know:
What the reason for the failure with a local runner could be (when the same script works with a shared runner)?
What I could do to debug this problem?
Thanks and all the best,
Thomas
GitLab CI doesn't currently offer a solution for using its runner with Docker in a Windows environment; however, there is an epic at the moment which is tracking progress for this.
In one of the issues of the epic, a contributor has managed to get a working version of a gitlab-runner which uses Docker for Windows; more details can be found here.
A more common (and potentially easier) way of using Docker in a Windows environment would be to install the gitlab-runner as a shell runner and call the Docker commands manually to run your tests.
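As a rough sketch of what that change would look like in config.toml (my assumption, not part of the original answer), the executor would switch from docker to shell, for example:
[[runners]]
  name = "NAME"
  url = "https://gitlab.com"
  token = "TOKEN"
  executor = "shell"
  shell = "powershell"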
Conversely, if you just want to keep using the same CI script, you could install a Linux VM on your Windows 10 machine, and have that host the docker runner!
We are building a data pipeline using the Beam Python SDK and trying to run it on Dataflow, but we are getting the error below:
A setup error was detected in beamapp-xxxxyyyy-0322102737-03220329-8a74-harness-lm6v. Please refer to the worker-startup log for detailed information.
But we could not find detailed worker-startup logs.
We tried increasing the memory size, worker count, etc., but we are still getting the same error.
Here is the command we use:
python run.py \
--project=xyz \
--runner=DataflowRunner \
--staging_location=gs://xyz/staging \
--temp_location=gs://xyz/temp \
--requirements_file=requirements.txt \
--worker_machine_type n1-standard-8 \
--num_workers 2
Pipeline snippet:
data = pipeline | "load data" >> beam.io.Read(
    beam.io.BigQuerySource(query="SELECT * FROM abc_table LIMIT 100")
)
data | "filter data" >> beam.Filter(lambda x: x.get('column_name') == value)
The above pipeline just loads data from BigQuery and filters it based on some column value. It works like a charm with the DirectRunner but fails on Dataflow.
Are we making any obvious setup mistake? Is anyone else getting the same error? We could use some help to resolve the issue.
Update:
Our pipeline code is spread across multiple files, so we created a Python package. We solved the setup error problem by passing the --setup_file argument instead of --requirements_file.
We resolved this setup error issue by sending a different set of arguments to Dataflow. Our code is spread across multiple files, so we had to create a package for it. If we use --requirements_file, the job will start but eventually fail, because it wouldn't be able to find the package on the workers. The Beam Python SDK sometimes does not throw an explicit error message for this; instead, it retries the job and fails. To get your code running with a package, you will need to pass the --setup_file argument, which has the dependencies listed in it. Make sure the package created by the python setup.py sdist command includes all the files required by your pipeline code.
If you have a privately hosted Python package dependency, then pass --extra_package with the path to the package.tar.gz file. A better way is to store it in a GCS bucket and pass the path here.
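For illustration only (the names below are placeholders, not taken from the original answer), a minimal setup.py for such a package could look roughly like this:
import setuptools

setuptools.setup(
    name="my_pipeline",                      # hypothetical package name
    version="0.1.0",
    packages=setuptools.find_packages(),     # pick up all pipeline modules
    install_requires=["apache-beam[gcp]"],   # plus whatever else your code imports
)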
I have written an example project to get started with Apache Beam Python SDK on Dataflow - https://github.com/RajeshHegde/apache-beam-example
Read about it here - https://medium.com/@rajeshhegde/data-pipeline-using-apache-beam-python-sdk-on-dataflow-6bb8550bf366
I'm building a prediction pipeline using Apache Beam/Dataflow. I need to include the model files inside the dependencies available to the remote workers. The Dataflow job failed with the same error log:
Error message from worker: A setup error was detected in beamapp-xxx-xxxxxxxxxx-xxxxxxxx-xxxx-harness-xxxx. Please refer to the worker-startup log for detailed information.
However, this error message didn't give any details about the worker-startup log. Finally, I found a way to get the worker log and solve the problem.
As is known, Dataflow creates Compute Engine instances to run jobs and saves logs on them, so we can access the VM to see the logs. We can connect to the VM used by our Dataflow job from the GCP console via SSH. Then we can check the boot-json.log file located in /var/log/dataflow/taskrunner/harness:
$ cd /var/log/dataflow/taskrunner/harness
$ cat boot-json.log
Here we should pay attention: when running in batch mode, the VM created by Dataflow is ephemeral and is shut down when the job fails. Once the VM is shut down, we can't access it anymore. But a process including a failing item is retried 4 times, so normally we have enough time to open boot-json.log and see what is going on.
Finally, here is my Python setup solution, which may help someone else:
main.py
...
model_path = os.path.dirname(os.path.abspath(__file__)) + '/models/net.pd'
# pipeline code
...
MANIFEST.in
include models/*.*
setup.py complete example
import setuptools

REQUIRED_PACKAGES = [...]

setuptools.setup(
    ...
    include_package_data=True,
    install_requires=REQUIRED_PACKAGES,
    packages=setuptools.find_packages(),
    package_data={"models": ["models/*"]},
    ...
)
Run Dataflow pipelines
$ python main.py --setup_file=/absolute/path/to/setup.py ...
I'm attempting to run stratum-mining-proxy with minerd. The proxy starts and runs with the following command:
python ./mining_proxy.py -o ltc-stratum.kattare.com -p 3333 -pa scrypt
The proxy starts fine. Then I run minerd (username/password removed):
minerd -a scrypt -r 1 -s 6 -o http://127.0.0.1:3333 -O USERNAME.1:PASSWORD
The following errors are received. This one is from the proxy:
2013-07-18 01:33:59,981 ERROR protocol protocol.dataReceived # Processing of message failed
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/stratum-0.2.12-py2.7.egg/stratum/protocol.py", line 185, in dataReceived
self.lineReceived(line, request_counter)
File "/usr/local/lib/python2.7/dist-packages/stratum-0.2.12-py2.7.egg/stratum/protocol.py", line 216, in lineReceived
raise custom_exceptions.ProtocolException("Cannot decode message '%s'" % line)
'rotocolException: Cannot decode message 'POST / HTTP/1.1
And this from minerd. What am I doing wrong? Any help is appreciated!
[2013-07-18 01:33:59] HTTP request failed: Empty reply from server
[2013-07-18 01:33:59] json_rpc_call failed, retry after 30 seconds
I am a little curious; I don't know this as a fact, but I was under the impression that the mining proxy was for BTC, not LTC.
But anyway, I believe I got a similar message when I first installed it as well. To fix it, or rather to actually get it running, I had to use the Git installation method instead of installing manually.
Installation on Linux using Git
This is an advanced option for experienced users, but it gives you the easiest way to update the proxy.
1.git clone git://github.com/slush0/stratum-mining-proxy.git
2.cd stratum-mining-proxy
3.sudo apt-get install python-dev # The Python development package is necessary
4.sudo python distribute_setup.py # This will upgrade the setuptools package
5.sudo python setup.py develop # This will install the required dependencies (namely the Twisted and Stratum libraries), but doesn't install the package into the system.
6.You can start the proxy by typing "./mining_proxy.py" in the terminal window. Using the default settings, the proxy connects to Slush's pool interface.
7.If you want to connect to another pool or change other proxy settings, type "./mining_proxy.py --help".
8.If you want to update the proxy, type "git pull" in the package directory.