🔍 Ambar: Document Search Engine

Overview

Ambar is an open-source document search engine with automated crawling, OCR, tagging and instant full-text search.

Ambar offers a new way to integrate full-text document search into your workflow:

  • Easily deploy Ambar with a single docker-compose file
  • Perform Google-like search through your documents and contents of your images
  • Tag your documents
  • Use a simple REST API to integrate Ambar into your workflow

Features

Search

Tutorial: Mastering Ambar Search Queries

  • Fuzzy Search (John~3)
  • Phrase Search ("John Smith")
  • Search By Author (author:John)
  • Search By File Path (filename:*.txt)
  • Search By Date (when: yesterday, today, lastweek, etc)
  • Search By Size (size>1M)
  • Search By Tags (tags:ocr)
  • Search As You Type
  • Supported language analyzers: English ambar_en, Russian ambar_ru, German ambar_de, Italian ambar_it, Polish ambar_pl, Chinese ambar_cn, CJK ambar_cjk
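Operators can be combined in a single query, e.g. author:John size>1M tags:ocr to find large OCRed files authored by John. As a sketch of running such a query through the REST API mentioned above (the /api/search route and query parameter here are assumptions for illustration, not taken from the Ambar API reference):

```shell
# Hypothetical sketch: URL-encode a combined Ambar query and send it
# to the Web API. The route and parameter name are assumptions only.
curl -G "http://localhost:8080/api/search" \
  --data-urlencode "query=author:John size>1M tags:ocr"
```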

Crawling

Ambar 2.0 supports only local file system crawling. If you need to crawl an SMB share or an FTP location, just mount it using standard Linux tools. Crawling is automatic and needs no schedule: the crawlers monitor file system events and automatically process new, changed, and removed files.
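For example, a remote share can be exposed to the local crawler with standard mounting tools; a minimal sketch, assuming hypothetical host names, paths, and credentials (curlftpfs requires FUSE):

```shell
# Mount an SMB/CIFS share read-only at a path the crawler watches
sudo mount -t cifs //nas/documents /mnt/ambar-data \
  -o username=user,password=secret,ro

# Or mount an FTP location via FUSE
sudo curlftpfs ftp://ftp.example.com /mnt/ambar-data -o ro
```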

Content Extraction

Ambar supports large files (>30MB).

Supported file types:

  • ZIP archives
  • Mail archives (PST)
  • MS Office documents (Word, Excel, Powerpoint, Visio, Publisher)
  • OCR over images
  • Email messages with attachments
  • Adobe PDF (with OCR)
  • OCR languages: Eng, Rus, Ita, Deu, Fra, Spa, Pol, Nld
  • OpenOffice documents
  • RTF, Plaintext
  • HTML / XHTML
  • Multithread processing

Installation

Notice: Ambar requires Docker to run

You can build Docker images by yourself or buy prebuilt Docker images for $50 here.

  • Installation instructions for prebuilt images: here
  • Tutorial on how to build the images from scratch: see below

If you want to see how Ambar works without installing it, try our live demo. No signup required.

Building the images yourself

All the images required to run Ambar can be built locally. In general, each image can be built by navigating into the directory of the component in question, performing any compilation steps required, and building the image like this:

# From project root
$ cd FrontEnd
$ docker build . -t <image_name>

The resulting image can be referred to by the name specified, and run by the containerization tooling of your choice.

To use a local Dockerfile with docker-compose, replace the image option with build, setting its value to the relative path of the directory containing the Dockerfile. Then run docker-compose build to build the relevant images. For example:

# docker-compose.yml from project root, referencing local Dockerfiles
pipeline0:
  build: ./Pipeline/
localcrawler:
  build: ./LocalCrawler/
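With build contexts in place, the images can then be built and the stack started with the usual Compose workflow:

```shell
# Build every service that declares a local build context,
# then start the stack in the background
docker-compose build
docker-compose up -d
```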

Note that some of the components require compilation or other build steps to be performed on the host before the Docker images can be built. For example, FrontEnd:

# Assuming a suitable version of node.js is installed (docker uses 8.10)
$ npm install
$ npm run compile
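Put together, building the FrontEnd image might look like this (a sketch; the image tag is arbitrary):

```shell
# From the project root: compile on the host, then bake the
# build output into a Docker image
cd FrontEnd
npm install
npm run compile
docker build . -t ambar-frontend
```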

FAQ

Is it open-source?

Yes, it's fully open-source.

Is it free?

Yes, it is forever free and open-source.

Does it perform OCR?

Yes, it performs OCR on images (JPG, TIFF, BMP, etc.) and PDFs. OCR is performed by the well-known open-source library Tesseract. We tuned it to achieve the best performance and quality on scanned documents. You can easily find all files on which OCR was performed with the tags:ocr query.
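Since the OCR engine is plain Tesseract, you can reproduce the kind of OCR pass Ambar performs on a single scan with the Tesseract CLI (assuming tesseract and the relevant language data are installed; file names here are placeholders):

```shell
# OCR a scanned image with the English and Russian models;
# writes the recognized text to output.txt
tesseract scan.jpg output -l eng+rus
cat output.txt
```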

Which languages are supported for OCR?

Supported languages: Eng, Rus, Ita, Deu, Fra, Spa, Pol, Nld. If your language is missing, please contact us at [email protected].

Does it support tagging?

Yes!

What about searching in PDF?

Yes, it can search through any PDF, even badly encoded ones or those with scans inside. We did our best to make search over any kind of PDF document smooth.

What is the maximum file size it can handle?

The maximum file size is limited by the amount of RAM on your machine; typically that means about 500MB. This is an excellent result, as typical document management systems cap processed files at 30MB.

I have a problem what should I do?

Request a dedicated support session by mailing us at [email protected].

Sponsors

Change Log

Privacy Policy

License

MIT License

Comments
  • Bug: Fresh install, going to server IP redirects me to "https://frontend"

    Hey all. I tried to freshly install this using the directions and the ambar.py script. Running Ubuntu Server 17.04. Initially it said ambar running on http://:80 but putting in the IP into the config under fe and host gets it to say http://i.p.i.p:80 but still no change.

    Anyone have any ideas what I might be doing wrong? I can provide any more info needed.

    Thanks, hbh7

    help wanted 
    opened by hbh7 27
  • Ambar behind a HTTP proxy

    Hi,

    I seem to be having problems with the ambar_webapi docker not using the system HTTP proxy correctly.

    I have installed the Ambar self-hosted community edition onto CentOS 7, which is behind an HTTP proxy. I have set up systemd for Docker to define HTTP_PROXY and HTTPS_PROXY correctly, i.e. I can download ambar ok.

    I can then access the web front end OK (changed to port 8005), but everything else is standard. However, I can't log in, sign up or anything: I get an 'oops something went wrong' message.

    Inspecting the docker log for ambar_webapi seems to show attempts to access a remote host (52.64.9.77, an amazonaws.com host - mandrillapp.com??) without using the HTTPS proxy:

    [[email protected] ambar]# ./ambar.py start

    [ASCII-art Ambar logo]

    Docker version 17.04.0-ce, build 4845c56
    docker-compose version 1.13.0, build 1719ceb
    vm.max_map_count = 262144
    net.ipv4.ip_local_port_range = 15000 61000
    net.ipv4.tcp_fin_timeout = 30
    net.core.somaxconn = 1024
    net.core.netdev_max_backlog = 2000
    net.ipv4.tcp_max_syn_backlog = 2048
    Creating network "ambar_internal_network" with the default driver
    Creating ambar_db_1 ...
    Creating ambar_rabbit_1 ...
    Creating ambar_proxy_1 ...
    Creating ambar_webapi-cache_1 ...
    Creating ambar_es_1 ... done
    Creating ambar_webapi_1 ... done
    Creating ambar_frontend_1 ... done
    Waiting for Ambar to start...
    Ambar is running on http://147.66.12.53:8005

    [[email protected] ambar]# cat docker inspect --format='{{.LogPath}}' ambar_webapi_1
    {"log":"2017/05/03 05:44:08 Waiting for host: \n","stream":"stderr","time":"2017-05-03T05:44:08.385748978Z"}
    {"log":"2017/05/03 05:44:08 Waiting for host: es:9200\n","stream":"stderr","time":"2017-05-03T05:44:08.385884331Z"}
    {"log":"2017/05/03 05:44:08 Connected to unix:///var/run/docker.sock\n","stream":"stderr","time":"2017-05-03T05:44:08.388017144Z"}
    {"log":"2017/05/03 05:44:22 Received 200 from http://es:9200\n","stream":"stderr","time":"2017-05-03T05:44:22.302769292Z"}
    {"log":"Crawler schedule service initialized\n","stream":"stdout","time":"2017-05-03T05:44:24.380922736Z"}
    {"log":"Pipeline initialized\n","stream":"stdout","time":"2017-05-03T05:44:24.71064609Z"}
    {"log":"Started on :::8080\n","stream":"stdout","time":"2017-05-03T05:44:24.720793191Z"}
    {"log":"{ [Error: connect ECONNREFUSED 52.64.27.232:443]\n","stream":"stderr","time":"2017-05-03T06:53:42.270438821Z"}
    {"log":"  code: 'ECONNREFUSED',\n","stream":"stderr","time":"2017-05-03T06:53:42.270489177Z"}
    {"log":"  errno: 'ECONNREFUSED',\n","stream":"stderr","time":"2017-05-03T06:53:42.270497139Z"}
    {"log":"  syscall: 'connect',\n","stream":"stderr","time":"2017-05-03T06:53:42.270503494Z"}
    {"log":"  address: '52.64.27.232',\n","stream":"stderr","time":"2017-05-03T06:53:42.27050999Z"}
    {"log":"  port: 443 }\n","stream":"stderr","time":"2017-05-03T06:53:42.270516275Z"}
    {"log":"{ [Error: connect ECONNREFUSED 52.64.9.77:443]\n","stream":"stderr","time":"2017-05-03T06:53:43.182362118Z"}
    {"log":"  code: 'ECONNREFUSED',\n","stream":"stderr","time":"2017-05-03T06:53:43.18240549Z"}
    {"log":"  errno: 'ECONNREFUSED',\n","stream":"stderr","time":"2017-05-03T06:53:43.182413382Z"}
    {"log":"  syscall: 'connect',\n","stream":"stderr","time":"2017-05-03T06:53:43.182444112Z"}
    {"log":"  address: '52.64.9.77',\n","stream":"stderr","time":"2017-05-03T06:53:43.182451306Z"}
    {"log":"  port: 443 }\n","stream":"stderr","time":"2017-05-03T06:53:43.182457382Z"}

    It seems that when I try to recover my password (I type my email and hit 'recover password'), a new entry appears in the ambar_webapi docker log which looks like our HTTP proxy issue (see above).

    I have not yet been able to login at all to the Ambar web front end.

    any ideas?

    Regards Kym

    bug 
    opened by knewbery 24
  • Initial e-mail does not arrive

    Hi. What conditions must be met to successfully send login credentials? I'm trying your brilliant software on my internal network and cannot use auth 'none' for security reasons. Here are my specs: Ubuntu 16.04.03, Docker 17.09.1-ce, Docker-compose 1.18.0.

    Thanks.

    bug 
    opened by nonylion 20
  • "Oops.... Something went wrong" during loading

    It seems like the api is not accessible, even though installation went without any apparent issue. During loading of the page, I get the error "Oops.... Something went wrong" at the bottom. It looks like the ambar-webapi container is restarting every 5 minutes due to not connecting to the ambar-es container?

    [email protected]:~$ sudo ./ambar.py start
    
    
    ______           ____     ______  ____
    /\  _  \  /'\_/`\/\  _`\  /\  _  \/\  _`\
    \ \ \L\ \/\      \ \ \L\ \ \ \L\ \ \ \L\ \
     \ \  __ \ \ \__\ \ \  _ <'\ \  __ \ \ ,  /
      \ \ \/\ \ \ \_/\ \ \ \L\ \ \ \/\ \ \ \ \
       \ \_\ \_\ \_\ \_\ \____/ \ \_\ \_\ \_\ \_\
        \/_/\/_/\/_/ \/_/\/___/   \/_/\/_/\/_/\/ /
    
    
    
    Docker version 17.03.1-ce, build c6d412e
    docker-compose version 1.11.2, build dfed245
    vm.max_map_count = 262144
    net.ipv4.ip_local_port_range = 15000 61000
    net.ipv4.tcp_fin_timeout = 30
    net.core.somaxconn = 1024
    net.core.netdev_max_backlog = 2000
    net.ipv4.tcp_max_syn_backlog = 2048
    ambar_db_1 is up-to-date
    ambar_es_1 is up-to-date
    ambar_rabbit_1 is up-to-date
    ambar_frontend_1 is up-to-date
    ambar_webapi_1 is up-to-date
    ambar_webapi-cache_1 is up-to-date
    Waiting for Ambar to start...
    Ambar is running on http://10.20.30.13:80
    

    ambar-webapi container log output:

    2017/04/07 05:08:51 Timeout after 5m0s waiting on dependencies to become available: [unix:///var/run/docker.sock http://es:9200]
    2017/04/07 05:08:52 Waiting for host:
    2017/04/07 05:08:52 Waiting for host: es:9200
    2017/04/07 05:08:52 Connected to unix:///var/run/docker.sock
    2017/04/07 05:13:52 Timeout after 5m0s waiting on dependencies to become available: [unix:///var/run/docker.sock http://es:9200]
    2017/04/07 05:13:52 Waiting for host:
    2017/04/07 05:13:52 Waiting for host: es:9200
    2017/04/07 05:13:52 Connected to unix:///var/run/docker.sock
    2017/04/07 05:18:52 Timeout after 5m0s waiting on dependencies to become available: [unix:///var/run/docker.sock http://es:9200]
    2017/04/07 05:18:52 Waiting for host:
    2017/04/07 05:18:52 Waiting for host: es:9200
    2017/04/07 05:18:52 Connected to unix:///var/run/docker.sock
    

    ambar-es container logs:

    [2017-04-07T05:22:01,567][INFO ][o.e.n.Node               ] [BtkYnk-] stopping ...
    [2017-04-07T05:22:01,633][INFO ][o.e.n.Node               ] [BtkYnk-] stopped
    [2017-04-07T05:22:01,633][INFO ][o.e.n.Node               ] [BtkYnk-] closing ...
    [2017-04-07T05:22:01,646][INFO ][o.e.n.Node               ] [BtkYnk-] closed
    [2017-04-07T05:22:03,494][INFO ][o.e.n.Node               ] [] initializing ...
    [2017-04-07T05:22:03,612][INFO ][o.e.e.NodeEnvironment    ] [BtkYnk-] using [1] data paths, mounts [[/usr/share/elasticsearch/data (/dev/mapper/onlyoffice--vg-root)]], net usable_space [34.7gb], net total_space [46.6gb], spins? [possibly], types [ext4]
    [2017-04-07T05:22:03,612][INFO ][o.e.e.NodeEnvironment    ] [BtkYnk-] heap size [1007.3mb], compressed ordinary object pointers [true]
    [2017-04-07T05:22:03,660][INFO ][o.e.n.Node               ] node name [BtkYnk-] derived from node ID [BtkYnk-rRXGLNCk4JZeisA]; set [node.name] to override
    [2017-04-07T05:22:03,665][INFO ][o.e.n.Node               ] version[5.2.2], pid[1], build[f9d9b74/2017-02-24T17:26:45.835Z], OS[Linux/4.4.0-72-generic/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_121/25.121-b13]
    [2017-04-07T05:22:05,239][INFO ][o.e.p.PluginsService     ] [BtkYnk-] loaded module [aggs-matrix-stats]
    [2017-04-07T05:22:05,239][INFO ][o.e.p.PluginsService     ] [BtkYnk-] loaded module [ingest-common]
    [2017-04-07T05:22:05,239][INFO ][o.e.p.PluginsService     ] [BtkYnk-] loaded module [lang-expression]
    [2017-04-07T05:22:05,239][INFO ][o.e.p.PluginsService     ] [BtkYnk-] loaded module [lang-groovy]
    [2017-04-07T05:22:05,240][INFO ][o.e.p.PluginsService     ] [BtkYnk-] loaded module [lang-mustache]
    [2017-04-07T05:22:05,240][INFO ][o.e.p.PluginsService     ] [BtkYnk-] loaded module [lang-painless]
    [2017-04-07T05:22:05,240][INFO ][o.e.p.PluginsService     ] [BtkYnk-] loaded module [percolator]
    [2017-04-07T05:22:05,240][INFO ][o.e.p.PluginsService     ] [BtkYnk-] loaded module [reindex]
    [2017-04-07T05:22:05,240][INFO ][o.e.p.PluginsService     ] [BtkYnk-] loaded module [transport-netty3]
    [2017-04-07T05:22:05,240][INFO ][o.e.p.PluginsService     ] [BtkYnk-] loaded module [transport-netty4]
    [2017-04-07T05:22:05,242][INFO ][o.e.p.PluginsService     ] [BtkYnk-] loaded plugin [analysis-morphology]
    [2017-04-07T05:22:05,395][WARN ][o.e.d.s.g.GroovyScriptEngineService] [groovy] scripts are deprecated, use [painless] scripts instead
    [2017-04-07T05:22:08,149][INFO ][o.e.n.Node               ] initialized
    [2017-04-07T05:22:08,150][INFO ][o.e.n.Node               ] [BtkYnk-] starting ...
    [2017-04-07T05:22:08,258][WARN ][i.n.u.i.MacAddressUtil   ] Failed to find a usable hardware address from the network interfaces; using random bytes: f5:84:67:88:74:e6:c5:b2
    [2017-04-07T05:22:08,326][INFO ][o.e.t.TransportService   ] [BtkYnk-] publish_address {172.19.0.3:9300}, bound_addresses {[::]:9300}
    [2017-04-07T05:22:08,335][INFO ][o.e.b.BootstrapChecks    ] [BtkYnk-] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
    [2017-04-07T05:22:11,400][INFO ][o.e.c.s.ClusterService   ] [BtkYnk-] new_master {BtkYnk-}{BtkYnk-rRXGLNCk4JZeisA}{bcr5fJbTS6WeNLWTn3-wbg}{172.19.0.3}{172.19.0.3:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
    [2017-04-07T05:22:11,419][INFO ][o.e.h.HttpServer         ] [BtkYnk-] publish_address {172.19.0.3:9200}, bound_addresses {[::]:9200}
    [2017-04-07T05:22:11,419][INFO ][o.e.n.Node               ] [BtkYnk-] started
    [2017-04-07T05:22:11,669][INFO ][o.e.g.GatewayService     ] [BtkYnk-] recovered [2] indices into cluster_state
    [2017-04-07T05:22:12,231][INFO ][o.e.c.r.a.AllocationService] [BtkYnk-] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[ambar_log_record_data][7]] ...]).
    
    bug 
    opened by agreenfield1 18
  • Cannot view/download files

    Hi,

    I am struggling to understand how to access my files from the Web interface?

    Is there meant to be a download button? I can find the image preview, but that is it..


    bug 
    opened by dandantheflyingman 15
  • SMB crawler not working, share verified working

    Installed clean today on clean Ubuntu 16.04 install. Verified I can connect to the share from Windows and Linux using mount -t cifs. Crawler config:

    {
        "id": "data",
        "uid": "data_d033e22ae348aeb5660fc2140aec35850c4da997",
        "description": "nas crawler",
        "type": "smb",
        "locations": [
            {
                "host_name": "nas",
                "ip_address": "10.0.0.100",
                "location": "data"
            }
        ],
        "file_regex": "(\.doc[a-z]$)|(\.xls[a-z]$)|(\.txt$)|(\.csv$)|(\.htm[a-z]$)|(\.ppt[a-z]$)|(\.pdf$)|(\.msg$)|(\.eml$)|(\.rtf$)|(\.md$)|(\.png$)|(\.bmp$)|(\.tif[f]$)|(\.jp[e]g$)|(\.hwp$)",
        "credentials": {
            "auth_type": "ntlm",
            "login": "jes",
            "password": "*****",
            "token": ""
        },
        "schedule": {
            "is_active": true,
            "cron_schedule": "/15 * * * *"
        },
        "max_file_size_bytes": 30000000,
        "verbose": true
    }

    Error:

    2017-07-14 11:15:00.688: [info] filecrawler initialized
    2017-07-14 11:15:00.695: [error]
    2017-07-14 11:15:00.700: [error] error connecting to Smb share on nas

    Notice that there is no detail after the error at all.

    Also, how do I get to the logs for this system? I looked at docker logs but they said nothing about this issue. Thank you.

    help wanted 
    opened by effnorwood 15
  • Ambar is loading ...

    I followed the step-by-step guide with the same environment: Ubuntu Server 16.04 LTS, Docker CE version 17.06.2. However, I got "Ambar is loading..." and an "Oops something went wrong" message. I saw the same error message in a closed issue. Please advise.

    help wanted 
    opened by andychoi 14
  • Invalid port specification: "None"

    [[email protected] ambar]# ./ambar.py start
     
    
    ______           ____     ______  ____       
    /\  _  \  /'\_/`\/\  _`\  /\  _  \/\  _`\    
    \ \ \L\ \/\      \ \ \L\ \ \ \L\ \ \ \L\ \  
     \ \  __ \ \ \__\ \ \  _ <'\ \  __ \ \ ,  /   
      \ \ \/\ \ \ \_/\ \ \ \L\ \ \ \/\ \ \ \ \  
       \ \_\ \_\ \_\ \_\ \____/ \ \_\ \_\ \_\ \_\
        \/_/\/_/\/_/ \/_/\/___/   \/_/\/_/\/_/\/ /
    
    
                                                  
    Docker version 1.12.1, build 23cf638
    docker-compose version 1.12.0, build b31ff33
    vm.max_map_count = 262144
    net.ipv4.ip_local_port_range = 15000 61000
    net.ipv4.tcp_fin_timeout = 30
    net.core.somaxconn = 1024
    net.core.netdev_max_backlog = 2000
    net.ipv4.tcp_max_syn_backlog = 2048
    Creating ambar_db_1
    Creating ambar_rabbit_1
    Creating ambar_es_1
    Creating ambar_frontend_1
    
    ERROR: for es  Cannot create container for service es: b'Invalid port specification: "None"'
    
    ERROR: for db  Cannot create container for service db: b'Invalid port specification: "None"'
    
    ERROR: for rabbit  Cannot create container for service rabbit: b'Invalid port specification: "None"'
    
    ERROR: for frontend  Cannot create container for service frontend: b'Invalid port specification: "None"'
    ERROR: Encountered errors while bringing up the project.
    Traceback (most recent call last):
      File "./ambar.py", line 218, in <module>
        start(configuration)
      File "./ambar.py", line 187, in start
        runShellCommandStrict('docker-compose -f {0}/docker-compose.yml -p ambar up -d'.format(PATH))
      File "./ambar.py", line 45, in runShellCommandStrict
        subprocess.check_call(command, shell = True)
      File "/usr/local/lib/python3.5/subprocess.py", line 584, in check_call
        raise CalledProcessError(retcode, cmd)
    subprocess.CalledProcessError: Command 'docker-compose -f /root/ambar/docker-compose.yml -p ambar up -d' returned non-zero exit status 1
    
    opened by kirichenko 14
  • Tag by folder

    I'm trying to follow what happened in Issue #175 but am unable to reproduce his results.

    Here's my code:

    def AutoTagAmbarFile(self, AmbarFile):
        self.SetOCRTag(AmbarFile)
        self.SetSourceIdTag(AmbarFile)
        self.SetArchiveTag(AmbarFile)
        self.SetImageTag(AmbarFile)
        self.SetFolderTag(AmbarFile)

    Followed by this:

    def SetFolderTag(self, AmbarFile):
        if 'folderName' in AmbarFile['meta']['full_name']:
            self.AddTagToAmbarFile(AmbarFile['file_id'], AmbarFile['meta']['full_name'], self.AUTO_TAG_TYPE, 'folderName')

    I've tried altering a pre-existing tag as did the poster in Issue #175 , but was unable to see any change after I rebuilt the Pipeline image, pulled the new image, and spun up a new instance of AMBAR. I've tried clearing my browser cache, as that had caused issues in the past, but there was no change.

    Is there somewhere else I need to change some code in order for the new tag to show up on the search page?

    Thanks in advance for any help you can offer!

    opened by s1rk1t 13
  • ERROR: for serviceapi Container "xxxx" is unhealthy.

    Hi,

    I received this error while trying to start Docker; I think there is a problem with the ElasticSearch service:

    sudo docker-compose up -d
    root_db_1 is up-to-date
    root_es_1 is up-to-date
    root_rabbit_1 is up-to-date
    root_redis_1 is up-to-date
    ERROR: for serviceapi Container "b5182a16944e" is unhealthy.
    ERROR: Encountered errors while bringing up the project.

    version: "2.1"
    networks:
      internal_network:
    services:
      db:
        restart: always
        networks:
          - internal_network
        image: ambar/ambar-mongodb:2.0.1
        environment:
          - cacheSizeGB=2
        volumes:
          - /home/docker/db:/data/db
        expose:
          - "27017"
        ports:
          - "27017:27017"
      es:
        restart: always
        networks:
          - internal_network
        image: ambar/ambar-es:2.0.1
        expose:
          - "9200"
        ports:
          - "9200:9200"
        environment:
          - cluster.name=ambar-es
          - ES_JAVA_OPTS=-Xms2g -Xmx2g
        ulimits:
          memlock:
            soft: -1
            hard: -1
          nofile:
            soft: 65536
            hard: 65536
        cap_add:
          - IPC_LOCK
        volumes:
          - /home/docker/es:/usr/share/elasticsearch/data
      rabbit:
        restart: always
        networks:
          - internal_network
        image: ambar/ambar-rabbit:2.0.1
        hostname: rabbit
        expose:
          - "15672"
          - "5672"
        ports:
          - "15672:15672"
          - "5672:5672"
        volumes:
          - /home/docker/rabbit:/var/lib/rabbitmq
      redis:
        restart: always
        sysctls:
          - net.core.somaxconn=1024
        networks:
          - internal_network
        image: ambar/ambar-redis:2.0.1
        expose:
          - "6379"
        ports:
          - "6379:6379"
      serviceapi:
        depends_on:
          redis:
            condition: service_healthy
          rabbit:
            condition: service_healthy
          es:
            condition: service_healthy
          db:
            condition: service_healthy
        restart: always
        networks:
          - internal_network
        image: ambar/ambar-serviceapi:2.0.1
        expose:
          - "8081"
        ports:
          - "8081:8081"
        environment:
          - mongoDbUrl=mongodb://db:27017/ambar_data
          - elasticSearchUrl=http://es:9200
          - redisHost=redis
          - redisPort=6379
          - rabbitHost=amqp://rabbit
          - langAnalyzer=ambar_en
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock
      webapi:
        depends_on:
          serviceapi:
            condition: service_healthy
        restart: always
        networks:
          - internal_network
        image: ambar/ambar-webapi:2.0.1
        expose:
          - "8080"
        ports:
          - "8080:8080"
        environment:
          - analyticsToken=cda4b0bb11a1f32aed7564b08c455992
          - uiLang=en
          - mongoDbUrl=mongodb://db:27017/ambar_data
          - elasticSearchUrl=http://es:9200
          - redisHost=redis
          - redisPort=6379
          - serviceApiUrl=http://serviceapi:8081
          - rabbitHost=amqp://rabbit
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock
      frontend:
        depends_on:
          webapi:
            condition: service_healthy
        image: ambar/ambar-frontend:2.0.1
        restart: always
        networks:
          - internal_network
        ports:
          - "80:80"
        expose:
          - "80"
        environment:
          - api=http://145.239.139.196:8080
      pipeline0:
        depends_on:
          serviceapi:
            condition: service_healthy
        image: ambar/ambar-pipeline:2.0.1
        restart: always
        networks:
          - internal_network
        environment:
          - id=0
          - api_url=http://serviceapi:8081
          - rabbit_host=amqp://rabbit
      crawler0:
        depends_on:
          serviceapi:
            condition: service_healthy
        image: ambar/ambar-local-crawler
        restart: always
        networks:
          - internal_network
        environment:
          - apiUrl=http://serviceapi:8081
          - crawlPath=/usr/data
          - name=craw
        volumes:
          - /home/docker/2:/usr/data

    opened by mizbanpaytakht 13
  • Getting lots of the following 2 errors running docker-compose build

    Running latest off master I see lots of index_not_found errors. At what point and whose responsibility is it to post the index to es?

    serviceapi_1    | { Error: [index_not_found_exception] no such index, with { resource.type="index_or_alias" & resource.id="ambar_file_data" & index_uuid="_na_" & index="ambar_file_data" }
    serviceapi_1    |     at respond (/node_modules/elasticsearch/src/lib/transport.js:289:15)
    serviceapi_1    |     at checkRespForFailure (/node_modules/elasticsearch/src/lib/transport.js:248:7)
    serviceapi_1    |     at HttpConnector.<anonymous> (/node_modules/elasticsearch/src/lib/connectors/http.js:164:7)
    serviceapi_1    |     at IncomingMessage.wrapper (/node_modules/lodash/lodash.js:4929:19)
    serviceapi_1    |     at emitNone (events.js:111:20)
    serviceapi_1    |     at IncomingMessage.emit (events.js:208:7)
    serviceapi_1    |     at endReadableNT (_stream_readable.js:1064:12)
    serviceapi_1    |     at _combinedTickCallback (internal/process/next_tick.js:138:11)
    serviceapi_1    |     at process._tickCallback (internal/process/next_tick.js:180:9)
    serviceapi_1    |   status: 404,
    serviceapi_1    |   displayName: 'NotFound',
    serviceapi_1    |   message: '[index_not_found_exception] no such index, with { resource.type="index_or_alias" & resource.id="ambar_file_data" & index_uuid="_na_" & index="ambar_file_data" }',
    serviceapi_1    |   path: '/ambar_file_data/ambar_file/_search',
    serviceapi_1    |   query: { _source: 'false' },
    serviceapi_1    |   body:
    serviceapi_1    |    { error:
    serviceapi_1    |       { root_cause: [Array],
    serviceapi_1    |         type: 'index_not_found_exception',
    serviceapi_1    |         reason: 'no such index',
    serviceapi_1    |         'resource.type': 'index_or_alias',
    serviceapi_1    |         'resource.id': 'ambar_file_data',
    serviceapi_1    |         index_uuid: '_na_',
    serviceapi_1    |         index: 'ambar_file_data' },
    serviceapi_1    |      status: 404 },
    serviceapi_1    |   statusCode: 404,
    serviceapi_1    |   response: '{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"ambar_file_data","index_uuid":"_na_","index":"ambar_file_data"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"ambar_file_data","index_uuid":"_na_","index":"ambar_file_data"},"status":404}',
    serviceapi_1    |   toString: [Function],
    serviceapi_1    |   toJSON: [Function] }
    
    wontfix 
    opened by AYapejian 12
  • no basic auth credentials

    Hi! I bought a prebuilt image and got the instructions in the letter. I logged in with the information they sent me. But any time I try to docker-compose pull I get the error no basic auth credentials. What should I do?

    opened by sylzerret 0
  • fix: LocalCrawler/Dockerfile to reduce vulnerabilities

    The following vulnerabilities are fixed with an upgrade:

    • https://snyk.io/vuln/SNYK-DEBIAN8-GIT-340820
    • https://snyk.io/vuln/SNYK-DEBIAN8-GIT-340907
    • https://snyk.io/vuln/SNYK-DEBIAN8-PROCPS-309313
    • https://snyk.io/vuln/SNYK-DEBIAN8-WGET-300469
    • https://snyk.io/vuln/SNYK-UPSTREAM-NODE-538286
    opened by ghost 0
Releases(v2.1.18)
Owner
RD17
Creating custom software to suit any need