Comments (11)

remingtonc commented on June 5, 2024

@AndrewC-B Regarding loading the data into ES, give this a shot:

cd tdm/
# git stash if necessary
git pull
git checkout etl-search-entry
docker build -t tdm/etl etl/
docker-compose run --rm etl python main.py --stage search

remingtonc commented on June 5, 2024

@AndrewC-B Hmm, that is unfortunate. Can you provide any details on the deployment? Are you using docker-compose? The hostname search should resolve to the associated Elasticsearch instance. Were there any errors deploying the search container? It would likely be named tdm_search_1 or something along those lines. If you're trying to use a bare-metal Elasticsearch outside of the usual deployment, we will need to edit a couple of files, but that is easy to fix.

AndrewC-B commented on June 5, 2024

Yep, using docker-compose. I don't recall seeing any errors during deployment/setup (but that doesn't mean there weren't any - just that there weren't any in the most recent output when start.sh exited). Is there a log I could inspect to check?

EDIT: I'm not familiar with Docker, but I did a docker-compose restart search and didn't see any errors. Also, how can I prompt TDM to attempt to populate the search database again? reset.sh presumably wipes out the local database (which I hope isn't necessary).

EDIT2: Running start.sh again gives the output shown below. The web frontend works and I can look up OID paths under "DataPath Direct", but everything I've tried so far shows "no datapaths mapped as matches", and no mappings show up under "All Mappings". Search also returns no results. All of this is, I think, consistent with an issue with the search container.

Sending build context to Docker daemon  1.123GB
Step 1/10 : FROM python:3-alpine
 ---> aadc3feb2b19
Step 2/10 : RUN apk add --update git gcc libc-dev libxslt-dev
 ---> Using cache
 ---> c8ca9a09ecf9
Step 3/10 : RUN pip install --no-cache-dir pipenv
 ---> Using cache
 ---> 7bde921c5123
Step 4/10 : COPY src/Pipfile /data/Pipfile
 ---> Using cache
 ---> 9760056cc639
Step 5/10 : WORKDIR /data/
 ---> Using cache
 ---> edc57649f218
Step 6/10 : RUN pipenv --three install
 ---> Using cache
 ---> 751ef95c03c1
Step 7/10 : COPY src/ /data/
 ---> Using cache
 ---> 3ab208cac25d
Step 8/10 : WORKDIR /data/
 ---> Using cache
 ---> a5028b3a6245
Step 9/10 : ENTRYPOINT [ "pipenv", "run" ]
 ---> Using cache
 ---> 5f10412a6fe0
Step 10/10 : CMD [ "python", "main.py" ]
 ---> Using cache
 ---> 358ab92c2b4e
Successfully built 358ab92c2b4e
Successfully tagged tdm/etl:latest
Sending build context to Docker daemon  3.054MB
Step 1/13 : FROM python:3-alpine
 ---> aadc3feb2b19
Step 2/13 : RUN pip install --no-cache-dir pip==18.0
 ---> Using cache
 ---> 1a57fe70b567
Step 3/13 : RUN pip install --no-cache-dir pipenv
 ---> Using cache
 ---> 5cc2a64d9474
Step 4/13 : COPY src/Pipfile /data/Pipfile
 ---> Using cache
 ---> dc4fb4c3d08d
Step 5/13 : WORKDIR /data/
 ---> Using cache
 ---> 148342938927
Step 6/13 : RUN apk add --no-cache --virtual .build-deps gcc musl-dev
 ---> Using cache
 ---> daca75b67d1b
Step 7/13 : RUN pipenv --three install
 ---> Using cache
 ---> 6c3cfafb3b9e
Step 8/13 : RUN apk del .build-deps gcc musl-dev
 ---> Using cache
 ---> 861618b6bb5d
Step 9/13 : COPY src/ /data/
 ---> Using cache
 ---> bce303c990e6
Step 10/13 : WORKDIR /data/
 ---> Using cache
 ---> b9d6af888322
Step 11/13 : EXPOSE 80
 ---> Using cache
 ---> b521493df0c1
Step 12/13 : ENTRYPOINT [ "pipenv", "run" ]
 ---> Using cache
 ---> 06c04fab7956
Step 13/13 : CMD [ "python", "runserver.py" ]
 ---> Using cache
 ---> 374dbd40e146
Successfully built 374dbd40e146
Successfully tagged tdm/web:latest
tdm_goaccess_1 is up-to-date
tdm_nginx_1 is up-to-date
tdm_web_1 is up-to-date
tdm_dbms_1 is up-to-date
Starting tdm_etl_1 ... 
Starting tdm_etl_1    ... done
Starting tdm_search_1 ... done

remingtonc commented on June 5, 2024

@AndrewC-B Based on that output, it does look like the Elasticsearch container went down or never started. Running docker logs tdm_search_1 may show logs from the first execution, which could be enlightening.
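
If the log is long, the check can be scripted. The following is a rough Python sketch, not part of TDM: the container name tdm_search_1 comes from the output above, while the function names and the marker strings ("ERROR", "FATAL", "Exception") are assumptions chosen for illustration.

```python
import subprocess


def fetch_container_logs(container: str = "tdm_search_1") -> str:
    """Return the combined stdout/stderr of `docker logs <container>`."""
    result = subprocess.run(
        ["docker", "logs", container],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # docker logs may write to either stream
        text=True,
        check=False,  # still return the output if docker exits non-zero
    )
    return result.stdout


def suspicious_lines(log_text, markers=("ERROR", "FATAL", "Exception")):
    """Return the log lines that contain any of the marker strings."""
    return [
        line
        for line in log_text.splitlines()
        if any(marker in line for marker in markers)
    ]


# Example usage (requires Docker and an existing container):
#   for line in suspicious_lines(fetch_container_logs()):
#       print(line)
```

The filtering is kept separate from the docker call so it can also be run against a saved log file.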

Usually, you would have to restart the entire process, but it's probably time to change that. :) I'm in an area with a terrible internet connection at the moment, but I'm ugly-patching it in branch etl-search-entry. I'll ping you when it's tested, with docs on how to restart the ETL process and jump straight to the search stage.

AndrewC-B commented on June 5, 2024

Thanks! The full log is here. The most obvious issue to me is:

ERROR: [1] bootstrap checks failed
[1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[2018-12-05T22:40:59,025][INFO ][o.e.n.Node               ] [RL38epu] stopping ...
[2018-12-05T22:40:59,051][INFO ][o.e.n.Node               ] [RL38epu] stopped
[2018-12-05T22:40:59,052][INFO ][o.e.n.Node               ] [RL38epu] closing ...
[2018-12-05T22:40:59,058][INFO ][o.e.n.Node               ] [RL38epu] closed
[2018-12-05T22:40:59,060][INFO ][o.e.x.m.j.p.NativeController] Native controller process has stopped - no new native processes can be started

Does this seem to you like a likely cause of the issues I'm seeing? If so, can you recommend steps to rectify it? A quick Google search suggests something like this, from here:

docker-machine ssh
sudo sysctl -w vm.max_map_count=262144
exit

remingtonc commented on June 5, 2024

That definitely looks like the culprit. I'm not using Windows, which makes it difficult to test, but reading through the thread, those commands look like the best bet. The alternative is to set the discovery.type=single-node environment variable for ES, which puts it into development mode, e.g. here.

AndrewC-B commented on June 5, 2024

Cheers! I reset overnight. Search now returns results, although All Mappings is still empty (and all the paths I've tried so far say "no datapaths mapped as matches", although some have a "matching datapath"). At a guess, are mapped paths manually defined, and matched paths computer generated? And any ideas about All Mappings being empty?

remingtonc commented on June 5, 2024

Smart :) The mappings are not automated, and the existing mappings are not open-source (yet?), nor is there a public-facing instance for community effort. We're figuring that out. At the moment, we're tackling the human problem and crowd-sourcing, and developing tools to try to automate the mappings, which would then be put into TDM.

remingtonc commented on June 5, 2024

Feel free to open a separate issue on the mappings specifically if this is resolved.

AndrewC-B commented on June 5, 2024

Thanks!

EDIT: Oh, just in case it matters, my environment is Linux, not Windows.

remingtonc commented on June 5, 2024

Ah yes, then you would simply change the sysctl parameters (no docker-machine ssh necessary).
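
To confirm the host setting before and after the change, something like the Python sketch below could be used. It is only an illustration: the 262144 threshold is taken from the Elasticsearch error earlier in this thread, /proc/sys/vm/max_map_count is the standard Linux location for the value, and the function names are made up for this example.

```python
from pathlib import Path

# Minimum value named in the Elasticsearch bootstrap check error.
REQUIRED_MAX_MAP_COUNT = 262144
# Standard Linux procfs location of the current setting.
PROC_PATH = Path("/proc/sys/vm/max_map_count")


def parse_max_map_count(raw: str) -> int:
    """Convert the raw file contents (e.g. '65530\\n') to an integer."""
    return int(raw.strip())


def is_sufficient(current: int, required: int = REQUIRED_MAX_MAP_COUNT) -> bool:
    """Return True if the current value meets the required minimum."""
    return current >= required


def read_max_map_count(path: Path = PROC_PATH) -> int:
    """Read the current vm.max_map_count from the host (Linux only)."""
    return parse_max_map_count(path.read_text())


# Example usage on a Linux host:
#   current = read_max_map_count()
#   print(current, "ok" if is_sufficient(current) else "too low")
```

Note that a value set with sysctl -w does not survive a reboot, so the check is worth repeating after the host restarts.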

If you have any questions feel free to open an issue or email [email protected] (or [email protected]).
