Pelican plugin that adds site search capability

Overview

Search: A Plugin for Pelican


This plugin generates an index for searching content on a Pelican-powered site.

Why would you want this?

Static sites are, well, static… and thus usually don’t have an application server component that could be used to power site search functionality. Rather than give up control (and privacy) to third-party search engine corporations, this plugin adds elegant and self-hosted site search capability to your site. Last but not least, searches are really fast. 🚀

Installation

This plugin uses Stork to generate a search index. Follow the Stork installation instructions to install this required command-line tool and ensure it is available within /usr/local/bin/ or another $PATH-accessible location of your choosing. For example, Stork can be installed on macOS via:

export STORKVERSION="v1.2.1"
wget -O /usr/local/bin/stork https://files.stork-search.net/releases/$STORKVERSION/stork-macos-10-15
chmod +x /usr/local/bin/stork

Confirm that Stork is properly installed via:

stork --help
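
The same check can be scripted, for example as part of a build step. The following is a minimal sketch using only the Python standard library; the helper name `stork_available` is hypothetical and not part of this plugin.

```python
import shutil
import subprocess


def stork_available(executable: str = "stork") -> bool:
    """Return True if the given executable can be found on $PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if stork_available():
        # Print the installed version, similar to the manual check above.
        subprocess.run(["stork", "--version"], check=True)
    else:
        print("Stork was not found on $PATH; install it before enabling the plugin.")
```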

Once Stork has been successfully installed and tested, this plugin can be installed via:

python -m pip install pelican-search

Settings

This plugin’s behavior can be customized via Pelican settings. Those settings, and their default values, follow below.

SEARCH_MODE = "output"

In addition to plain-text files, Stork can recognize and index HTML and Markdown-formatted content. The default behavior of this plugin is to index generated HTML files, since Stork is good at extracting content from tags, scripts, and styles. But that mode may require a slight theme modification that isn’t necessary when indexing Markdown source (see SEARCH_HTML_SELECTOR setting below). That said, indexing Markdown means that markup information may not be removed from the indexed content and will thus be visible in the search preview results. With that caveat aside, if you want to index Markdown source content files instead of the generated HTML output, you can use: SEARCH_MODE = "source"

SEARCH_HTML_SELECTOR = "main"

By default, Stork looks for <main>[…]</main> tags to determine where your main content is located. If such tags do not already exist in your theme’s template files, you can either (1) add <main> tags or (2) change the HTML selector that Stork should look for.

To use the default main selector, in each of your theme’s relevant template files, wrap the content you want to index with <main> tags. For example:

article.html:

<main>
{{ article.content }}
</main>

page.html:

<main>
{{ page.content }}
</main>

For more information, refer to Stork’s documentation on HTML tag selection.
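
Put together, the settings described above might appear in pelicanconf.py roughly as follows. This is a sketch rather than a required configuration. The explicit PLUGINS entry is an assumption based on the MCVE further down this page: recent Pelican releases can discover namespace plugins on their own when PLUGINS is not defined, but if your configuration already defines PLUGINS, the search plugin needs to be listed there.

```python
# pelicanconf.py (excerpt)

# Assumption: this configuration defines PLUGINS explicitly, so the search
# plugin has to be listed alongside any other plugins in use.
PLUGINS = ["search"]

# Index the generated HTML output (the default) ...
SEARCH_MODE = "output"
# ... or index the Markdown source files instead:
# SEARCH_MODE = "source"

# CSS selector that Stork uses to locate the main content (the default).
SEARCH_HTML_SELECTOR = "main"
```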

Static Assets

There are two options for serving the necessary JavaScript, WebAssembly, and CSS static assets:

  1. Use a content delivery network (CDN) to serve Stork’s static assets
  2. Self-host the Stork static assets

The first option is easier to set up. The second option is provided for folks who prefer to self-host everything. After you have decided which option you prefer, follow the relevant section’s instructions below.

Static Assets - Option 1: Use CDN

CSS

Add the Stork CSS before the closing </head> tag in your theme’s base template file, such as base.html:

<link rel="stylesheet" href="https://files.stork-search.net/basic.css" />

If your theme supports dark mode, you may want to also add Stork’s dark CSS file:

<link rel="stylesheet" media="screen and (prefers-color-scheme: dark)" href="https://files.stork-search.net/dark.css">

JavaScript

Add the following script tags to your theme’s base template, just before your closing </body> tag, which will load the referenced Stork release along with the matching WASM binary:

<script src="https://files.stork-search.net/releases/v1.2.1/stork.js"></script>
<script>
    stork.register("sitesearch", "{{ SITEURL }}/search-index.st")
</script>

Static Assets - Option 2: Self-Host

Download the Stork JavaScript, WebAssembly, and CSS files and put them in your theme’s respective static asset directories:

export STORKVERSION="v1.2.1"
cd $YOUR-THEME-DIR
mkdir -p static/{js,css}
wget -O static/js/stork.js https://files.stork-search.net/releases/$STORKVERSION/stork.js
wget -O static/js/stork.wasm https://files.stork-search.net/releases/$STORKVERSION/stork.wasm
wget -O static/css/stork.css https://files.stork-search.net/basic.css
wget -O static/css/stork-dark.css https://files.stork-search.net/dark.css
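
If wget is not available, the same downloads could be scripted in Python. This is a minimal sketch using only the standard library; the function names `asset_urls` and `download_assets` are hypothetical, while the URLs and destination paths mirror the shell commands above.

```python
from pathlib import Path
from urllib.request import urlretrieve

STORK_VERSION = "v1.2.1"
BASE_URL = "https://files.stork-search.net"


def asset_urls(version: str = STORK_VERSION) -> dict:
    """Map theme-relative destination paths to their download URLs."""
    return {
        "static/js/stork.js": f"{BASE_URL}/releases/{version}/stork.js",
        "static/js/stork.wasm": f"{BASE_URL}/releases/{version}/stork.wasm",
        "static/css/stork.css": f"{BASE_URL}/basic.css",
        "static/css/stork-dark.css": f"{BASE_URL}/dark.css",
    }


def download_assets(theme_dir: str, version: str = STORK_VERSION) -> None:
    """Download each asset into theme_dir, creating directories as needed."""
    for relative_path, url in asset_urls(version).items():
        destination = Path(theme_dir) / relative_path
        destination.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(url, str(destination))
```

Calling `download_assets("path/to/your/theme")` (a placeholder path) would then fetch all four files.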

CSS

Add the Stork CSS before the closing </head> tag in your theme’s base template file, such as base.html:

<link rel="stylesheet" href="{{ SITEURL }}/{{ THEME_STATIC_DIR }}/css/stork.css">

If your theme supports dark mode, you may want to also add Stork’s dark CSS file:

<link rel="stylesheet" media="screen and (prefers-color-scheme: dark)" href="{{ SITEURL }}/{{ THEME_STATIC_DIR }}/css/stork-dark.css">

JavaScript & WebAssembly

Add the following script tags to your theme’s base template file, such as base.html, just before the closing </body> tag:

<script src="{{ SITEURL }}/{{ THEME_STATIC_DIR }}/js/stork.js"></script>
<script>
    stork.initialize("{{ SITEURL }}/{{ THEME_STATIC_DIR }}/js/stork.wasm")
    stork.downloadIndex("sitesearch", "{{ SITEURL }}/search-index.st")
    stork.attach("sitesearch")
</script>

Search Input Form

Decide where on your site you want to put your search field, such as your index.html template file. Then add the search field to the template:

Search: <input data-stork="sitesearch" />
<div data-stork="sitesearch-output"></div>

For more information regarding this topic, see the Stork search interface documentation.

Deployment

Ensure your production web server serves the WebAssembly file with the application/wasm MIME type. For folks using older versions of Nginx, that might look like the following:

…
http {
    …
    include             mime.types;
    # Types not included in older Nginx versions:
    types {
        application/wasm                                 wasm;
    }
    …
}

For other self-hosting considerations, see the Stork self-hosting documentation.
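
To verify that a deployed server sends the expected MIME type, a small check along these lines may help. It is a sketch using only the Python standard library; both function names are hypothetical, and the URL you pass in depends on where your theme serves stork.wasm.

```python
from urllib.request import Request, urlopen


def is_wasm_content_type(header_value: str) -> bool:
    """Return True if a Content-Type header value denotes application/wasm."""
    # Ignore optional parameters such as "; charset=..." and letter case.
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type == "application/wasm"


def server_sends_wasm_type(url: str) -> bool:
    """Issue a HEAD request and inspect the Content-Type response header."""
    request = Request(url, method="HEAD")
    with urlopen(request) as response:
        return is_wasm_content_type(response.headers.get("Content-Type", ""))
```

For example, `server_sends_wasm_type("https://example.com/theme/js/stork.wasm")` (a placeholder URL) should return True once the server is configured correctly.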

Contributing

Contributions are welcome and much appreciated. Every little bit helps. You can contribute by improving the documentation, adding missing features, and fixing bugs. You can also help out by reviewing and commenting on existing issues.

To start contributing to this plugin, review the Contributing to Pelican documentation, beginning with the Contributing Code section.

Comments
  • unexpected keyword argument 'capture_output'

    Installed the module according to the installation documentation.

    But I get the following error message: CRITICAL TypeError: __init__() got an unexpected keyword argument 'capture_output'

    stork --help results in the expected output.

    I've added the plugin to the configuration, as well as the settings:

    PLUGINS = [ .....
      'post_stats', 'related_posts', 'search', 'seo', 'simple_footnotes', 'share_post', 'sitemap',
      ....
    ]
    
    SEARCH_MODE = "output"
    SEARCH_HTML_SELECTOR = "main"
    

    Can this be investigated?
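
One possible explanation, not confirmed in this report: the `capture_output` keyword of `subprocess.run()` exists only in Python 3.7 and later, so a call that uses it fails with this kind of TypeError on older interpreters. The sketch below shows the long-hand equivalent that older versions accept; the helper name `run_and_capture` is hypothetical and not part of the plugin.

```python
import subprocess
import sys


def run_and_capture(command):
    """Run a command and capture its output without using capture_output."""
    # stdout=PIPE plus stderr=PIPE is what capture_output=True stands for,
    # and both keywords are accepted by Python versions older than 3.7.
    return subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,
    )


if __name__ == "__main__":
    supported = sys.version_info >= (3, 7)
    print("Python", sys.version_info[:2], "supports capture_output:", supported)
```

Upgrading to a newer Python version is likely the simpler remedy if this is indeed the cause.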

    opened by radoeka 8
  • expected newline, found an identifier

    • [X] I have read the Filing Issues and subsequent “How to Get Help” sections of the documentation.
    • [X] I have searched the issues (including closed ones) and believe that this is not a duplicate.

    • OS version and name: FreeBSD 13.0-RELEASE-p4
    • Python version: 3.8.12
    • Pelican version: 4.7.1
    • Version of this plugin: 1.0.0

    Steps up to this point:

    • I installed Stork as needed for this plugin via Rustup. stork --version reports 1.3.0.
    • I set up a venv with Pelican and generated some posts using the default theme. It all generated okay.
    • I installed pelican-search via pip in the venv
    • I edited the theme to add the CDN code and the search box.
    • I added SEARCH_MODE = "output" and SEARCH_HTML_SELECTOR = "main" to the pelicanconf.py file.
    • I called pelican to regenerate the site.

    After that I got this error.

    CRITICAL Exception: Search plugin         __init__.py:550
                        reported Error: expected                        
                        newline, found an identifier at                 
                        line 438 column 11   
    

    Help! I do not understand Python at all, so I don't have any insight into the issue beyond what I have here. :( I am happy to provide additional information if requested (and given a little instruction on how).

    bug 
    opened by oiseaumanifesto 6
  • How to translate stork?

    • [x] I have searched the issues (including closed ones) and believe that this is not a duplicate.
    • [x] I have searched the documentation and believe that my question is not covered.

    Issue

    How can I translate the search interface? My site is in Portuguese, but the search is in English:

    [image]

    question
    opened by paulocoutinhox 3
  • Wrong URL when inside an article

    • [x] I have read the Filing Issues and subsequent “How to Get Help” sections of the documentation.
    • [x] I have searched the issues (including closed ones) and believe that this is not a duplicate.

    Issue

    Hi,

    When search is enabled, the Stork result URL does not point to the correct place when I'm inside an article:

    [image]

    Example on screenshot above: http://localhost:8000/2022/06/28/2022/06/28/apocalipse-de-jesus-cristo-fase-3-o-trono-e-os-seres-viventes.html

    duplicated: 2022/06/28/2022/06/28
    

    If I'm on the home page, it works.

    Thanks.

    bug 
    opened by paulocoutinhox 2
  • demo website ?

    • [x] I have searched the issues (including closed ones) and believe that this is not a duplicate.
    • [x] I have searched the documentation and believe that my question is not covered.
    • [ ] I am willing to lend a hand to help implement this feature. (Well, at the moment I don't have time, but in a few weeks, why not.)

    Feature Request

    Hello ! It could be cool to have a demo site for this plugin ! :D

    enhancement 
    opened by ebanDev 2
  • search.toml

    I have installed the plugin, using Flex theme. I cannot see any search.toml, or instructions on how to generate it.

    • [x] I have searched the issues (including closed ones) and believe that this is not a duplicate.
    • [x] I have searched the documentation and believe that my question is not covered.

    Issue

    Trying to generate my theme, with the search enabled, and I get errors regarding no search.toml file - it is similar to https://github.com/pelican-plugins/search/issues/3 and it appears all I need is a search.toml file. However, I can't see any definitive / suggested configurations for pelican.

    If you could point me in the correct direction that would be superb!

    Thanks

    question 
    opened by Bobspadger 0
  • Escape double quotes in page titles in stork TOML file

    Pull Request Checklist

    Resolves: https://github.com/pelican-plugins/search/issues/3#issuecomment-1186287268

    • [x] Conformed to code style guidelines by running appropriate linting tools
    • [x] Updated documentation for changed code

    Description

    If a page title contains a double quote ("), the double quote is rendered into the stork toml file verbatim, creating a syntax error and causing stork to fail.

    This PR escapes double quotes in titles (AFAICT this is the only place where they should appear) with \", fixing the syntax errors.
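
The idea can be illustrated with a short sketch. This is not the code from the PR; the function names are hypothetical, and escaping backslashes as well goes slightly beyond what the PR describes (it is included here because a backslash is also an escape character in TOML basic strings).

```python
def escape_toml_basic_string(value: str) -> str:
    """Escape characters that would break a double-quoted TOML string."""
    # Escape backslashes first so the backslashes added for quotes survive.
    return value.replace("\\", "\\\\").replace('"', '\\"')


def toml_title_line(title: str) -> str:
    """Render a title line as it might appear in search.toml."""
    return f'title = "{escape_toml_basic_string(title)}"'


print(toml_title_line('The "Best" Post'))
```

Running this prints `title = "The \"Best\" Post"`, which Stork's TOML parser can read as a single string.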

    opened by s3lph 0
  • Layout issues due to Stork progress bar

    • [x] I have searched the issues (including closed ones) and believe that this is not a duplicate.
    • [x] I have searched the documentation and believe that my question is not covered.

    I have a couple layout issues with the Stork progress bar.

    First it appears in the middle of the article (seemingly where the results box would end).

    Second, more bothersome, blank space appears on the right side of my page, which impacts usability, especially on mobiles when swiping.

    Narrowed down the issue (deleting elements in DevTools until I found the culprit) to this div:

    <div class="stork-progress" style="width: 100%; opacity: 0;"></div>
    

    which is not mine - it seemingly gets inserted by the plugin at build time, and I have not managed to override the inline style via external CSS.

    When width: 0% my layout issue disappears (because the progress bar does not have any space, ie is hidden).

    Where/how can I change width to 0%, or deactivate the progress bar entirely (ie not have this div inserted)? Perhaps better for a future version to define those inline styles in the external CSS?

    question 
    opened by ndeville 0
  • hint to include plugins in pelicanconf.py

    Please include a hint to include plugins=['search'] in the README.MD

    • [x] I have searched the issues (including closed ones) and believe that this is not a duplicate.

    Issue

    opened by kika21 0
  • fix(windows): convert back-slashes to forward-slashes for Windows

    1. Summary

    Pelican Stork search doesn’t work correctly on my Windows machine if the OUTPUT_PATH setting is customized.

    After fixing I can successfully use Pelican Stork search:

    404 page search demo

    2. MCVE files

    You can see this MCVE configuration on the KiraPelicanPluginsSitemapStork branch of my demo repository for testing Pelican.

    All files except those listed below are the result of running the command pelican-quickstart.

    1. pelicanconf.py

      """MCVE."""
      
      # [INFO] Default settings
      AUTHOR = 'Sasha Chernykh'
      SITENAME = 'SashaPelicanDebugging'
      SITEURL = 'https://kristinita.netlify.app'
      
      PATH = 'content'
      
      TIMEZONE = 'Europe/Moscow'
      
      DEFAULT_LANG = 'en'
      
      ARTICLE_PATHS = [
          'Articles'
      ]
      
      MARKDOWN = {
          'output_format': 'html5',
      }
      
      
      # [INFO] Settings for this issue
      PLUGINS = [
          'search'
      ]
      
      SEARCH_HTML_SELECTOR = 'body'
      
      OUTPUT_PATH = 'output/'
      
      
    2. content/Articles/KiraArticle.md:

      Slug: KiraArticle
      Title: KiraArticle
      Date: 2020-09-24 18:57:33
      
      Kira Goddess!
      
      
    3. .circleci/config.yml:

      version: 2.1
      
      jobs:
        build:
          machine:
            image: ubuntu-2204:current
          steps:
          - checkout
          - run: pyenv global 3.10.5
          - run: pip install pelican markdown
          - run: pip install pelican-search
          # [INFO] Non-interactive Rust installation on Ubuntu
          # https://stackoverflow.com/a/57251636/5951529
          - run: curl https://sh.rustup.rs -sSf | sh -s -- -y
          - run: cargo install stork-search --locked
          - run: stork --version
          - run: pelican content -s pelicanconf.py --fatal warnings --debug
          - run: ls output
          - run: cat output/search.toml
      
      

    3. Behavior before change

    If OUTPUT_PATH is customized on Windows, Pelican Stork search generates invalid path slashes for the value of the base_directory setting in the search.toml file:

    [input]
    base_directory = "D:\SashaDemoRepositories\SashaPelicanDebugging\output"
    html_selector = "body"
    
    [[input.files]]
    path = "KiraArticle.html"
    url = "/KiraArticle.html"
    title = "KiraArticle"
    

    Incorrect TOML

    If I run:

    pelican content -s pelicanconf.py --fatal warnings --ignore-cache --debug
    

    I get an error:

    Exception: Search plugin reported Error: Couldn't read the configuration file: Cannot parse config as TOML. Stork recieved error: `invalid escape character in string: `S` at line 2 column 22`
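
The error arises because a backslash starts an escape sequence inside a double-quoted TOML string, so `\S` is rejected. One way to obtain a forward-slash path in Python is sketched below; whether the PR uses this exact approach is not shown here, and the helper name `to_forward_slashes` is hypothetical.

```python
from pathlib import PureWindowsPath


def to_forward_slashes(windows_path: str) -> str:
    """Convert a Windows path to forward slashes, which TOML leaves unescaped."""
    return PureWindowsPath(windows_path).as_posix()


print(to_forward_slashes(r"D:\SashaDemoRepositories\SashaPelicanDebugging\output"))
```

This prints `D:/SashaDemoRepositories/SashaPelicanDebugging/output`, a value that can be written into base_directory without further escaping.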
    

    Full output:

    D:\SashaDemoRepositories\SashaPelicanDebugging>pelican content -s pelicanconf.py --fatal warnings --ignore-cache --debug
    [11:23:23] DEBUG    Pelican version: 4.8.0                                                                                                                                                                                                                        __init__.py:531
               DEBUG    Python version: 3.10.6                                                                                                                                                                                                                        __init__.py:532
               DEBUG    Adding current directory to system path                                                                                                                                                                                                        __init__.py:66
               DEBUG    Finding namespace plugins                                                                                                                                                                                                                        _utils.py:81
               DEBUG    Namespace plugins found:                                                                                                                                                                                                                         _utils.py:84
                        pelican.plugins.search
                        pelican.plugins.sitemap
               DEBUG    Loading plugin `search`                                                                                                                                                                                                                          _utils.py:90
               DEBUG    Registering plugin `pelican.plugins.search`                                                                                                                                                                                                    __init__.py:73
               DEBUG    Found generator: ArticlesGenerator (internal)                                                                                                                                                                                                 __init__.py:209
               DEBUG    Found generator: PagesGenerator (internal)                                                                                                                                                                                                    __init__.py:209
               DEBUG    Found generator: SearchSettingsGenerator (pelican.plugins.search.search)                                                                                                                                                                      __init__.py:209
               DEBUG    Found generator: StaticGenerator (internal)                                                                                                                                                                                                   __init__.py:209
               DEBUG    Template list: ['!simple/archives.html', '!simple/article.html', '!simple/author.html', '!simple/authors.html', '!simple/base.html', '!simple/categories.html', '!simple/category.html', '!simple/gosquared.html', '!simple/index.html',     generators.py:70
                        '!simple/page.html', '!simple/pagination.html', '!simple/period_archives.html', '!simple/tag.html', '!simple/tags.html', '!simple/translations.html', '!theme/analytics.html', '!theme/archives.html', '!theme/article.html',
                        '!theme/article_infos.html', '!theme/author.html', '!theme/authors.html', '!theme/base.html', '!theme/categories.html', '!theme/category.html', '!theme/comments.html', '!theme/disqus_script.html', '!theme/github.html',
                        '!theme/index.html', '!theme/page.html', '!theme/period_archives.html', '!theme/tag.html', '!theme/taglist.html', '!theme/tags.html', '!theme/translations.html', '!theme/twitter.html', 'analytics.html', 'archives.html', 'article.html',
                        'article_infos.html', 'author.html', 'authors.html', 'base.html', 'categories.html', 'category.html', 'comments.html', 'disqus_script.html', 'github.html', 'gosquared.html', 'index.html', 'page.html', 'pagination.html',
                        'period_archives.html', 'tag.html', 'taglist.html', 'tags.html', 'translations.html', 'twitter.html']
               DEBUG    Read file Articles/KiraArticle.md -> Article                                                                                                                                                                                                   readers.py:547
               DEBUG    Signal article_generator_preread.send(ArticlesGenerator)                                                                                                                                                                                       readers.py:560
               DEBUG    Successfully imported extension module "markdown.extensions.meta".                                                                                                                                                                                core.py:163
               DEBUG    Successfully loaded extension "markdown.extensions.meta.MetaExtension".                                                                                                                                                                           core.py:126
               DEBUG    Signal article_generator_context.send(ArticlesGenerator, <metadata>)                                                                                                                                                                           readers.py:627
    [11:23:24] DEBUG    Read file images/.keep -> Static                                                                                                                                                                                                               readers.py:547
               DEBUG    Signal static_generator_preread.send(StaticGenerator)                                                                                                                                                                                          readers.py:560
               DEBUG    Signal static_generator_context.send(StaticGenerator, <metadata>)                                                                                                                                                                              readers.py:627
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/feeds/all.atom.xml                                                                                                                                                               writers.py:163
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/feeds/articles.atom.xml                                                                                                                                                          writers.py:163
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/feeds/sasha-chernykh.atom.xml                                                                                                                                                    writers.py:163
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/feeds/sasha-chernykh.rss.xml                                                                                                                                                     writers.py:163
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/feeds/all-en.atom.xml                                                                                                                                                            writers.py:163
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/KiraArticle.html                                                                                                                                                                 writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/index.html                                                                                                                                                                       writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/tags.html                                                                                                                                                                        writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/categories.html                                                                                                                                                                  writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/authors.html                                                                                                                                                                     writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/archives.html                                                                                                                                                                    writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/category/articles.html                                                                                                                                                           writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/author/sasha-chernykh.html                                                                                                                                                       writers.py:212
               CRITICAL Exception: Search plugin reported Error: Couldn't read the configuration file: Cannot parse config as TOML. Stork recieved error: `invalid escape character in string: `S` at line 2 column 22`                                               __init__.py:566
    
    ┌─────────────────────────────── Traceback (most recent call last) ────────────────────────────────┐
    │ C:\Python310\lib\site-packages\pelican\plugins\search\search.py:38 in build_search_index         │
    │                                                                                                  │
    │    35 │   │   if not which("stork"):                                                             │
    │    36 │   │   │   raise Exception("Stork must be installed and available on $PATH.")             │
    │    37 │   │   try:                                                                               │
    │ >  38 │   │   │   output = subprocess.run(                                                       │
    │    39 │   │   │   │   [                                                                          │
    │    40 │   │   │   │   │   "stork",                                                               │
    │    41 │   │   │   │   │   "build",                                                               │
    │                                                                                                  │
    │ C:\Python310\lib\subprocess.py:524 in run                                                        │
    │                                                                                                  │
    │    521 │   │   │   raise                                                                         │
    │    522 │   │   retcode = process.poll()                                                          │
    │    523 │   │   if check and retcode:                                                             │
    │ >  524 │   │   │   raise CalledProcessError(retcode, process.args,                               │
    │    525 │   │   │   │   │   │   │   │   │    output=stdout, stderr=stderr)                        │
    │    526 │   return CompletedProcess(process.args, retcode, stdout, stderr)                        │
    │    527                                                                                           │
    └──────────────────────────────────────────────────────────────────────────────────────────────────┘
    CalledProcessError: Command '['stork', 'build', '--input', 'D:\\SashaDemoRepositories\\SashaPelicanDebugging\\output\\search.toml', '--output', 'D:\\SashaDemoRepositories\\SashaPelicanDebugging\\output/search-index.st']' returned non-zero exit status 1.
    
    During handling of the above exception, another exception occurred:
    
    ┌─────────────────────────────── Traceback (most recent call last) ────────────────────────────────┐
    │ C:\Python310\lib\site-packages\pelican\__init__.py:562 in main                                   │
    │                                                                                                  │
    │   559 │   │   │   watcher = FileSystemWatcher(args.settings, Readers, settings)                  │
    │   560 │   │   │   watcher.check()                                                                │
    │   561 │   │   │   with console.status("Generating…"):                                            │
    │ > 562 │   │   │   │   pelican.run()                                                              │
    │   563 │   except KeyboardInterrupt:                                                              │
    │   564 │   │   logger.warning('Keyboard interrupt received. Exiting.')                            │
    │   565 │   except Exception as e:                                                                 │
    │                                                                                                  │
    │ C:\Python310\lib\site-packages\pelican\__init__.py:127 in run                                    │
    │                                                                                                  │
    │   124 │   │                                                                                      │
    │   125 │   │   for p in generators:                                                               │
    │   126 │   │   │   if hasattr(p, 'generate_output'):                                              │
    │ > 127 │   │   │   │   p.generate_output(writer)                                                  │
    │   128 │   │                                                                                      │
    │   129 │   │   signals.finalized.send(self)                                                       │
    │   130                                                                                            │
    │                                                                                                  │
    │ C:\Python310\lib\site-packages\pelican\plugins\search\search.py:113 in generate_output           │
    │                                                                                                  │
    │   110 │   │   │   fd.write(search_settings)                                                      │
    │   111 │   │                                                                                      │
    │   112 │   │   # Build the search index                                                           │
    │ > 113 │   │   build_log = self.build_search_index(search_settings_path)                          │
    │   114 │   │   build_log = "".join(["Search plugin reported ", build_log])                        │
    │   115 │   │   logger.error(build_log) if "error" in build_log else logger.debug(build_log)       │
    │   116                                                                                            │
    │                                                                                                  │
    │ C:\Python310\lib\site-packages\pelican\plugins\search\search.py:52 in build_search_index         │
    │                                                                                                  │
    │    49 │   │   │   │   check=True,                                                                │
    │    50 │   │   │   )                                                                              │
    โ”‚    51 โ”‚   โ”‚   except subprocess.CalledProcessError as e:                                         โ”‚
    โ”‚ >  52 โ”‚   โ”‚   โ”‚   raise Exception("".join(["Search plugin reported ", e.stdout, e.stderr]))      โ”‚
    โ”‚    53 โ”‚   โ”‚                                                                                      โ”‚
    โ”‚    54 โ”‚   โ”‚   return output.stdout                                                               โ”‚
    โ”‚    55                                                                                            โ”‚
    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
    Exception: Search plugin reported Error: Couldn't read the configuration file: Cannot parse config as TOML. Stork recieved error: `invalid escape character in string: `S` at line 2 column 22`
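
The error above can be reproduced without Stork. In a TOML basic (double-quoted) string a backslash starts an escape sequence, and `\S` is not a valid one. The following is a minimal sketch, not Stork's actual parser: `first_invalid_escape` is a hypothetical helper that scans one line of the config for the first backslash that does not begin a TOML 1.0 escape.

```python
import re

# Escape sequences allowed in a TOML 1.0 basic string:
# \b \t \n \f \r \" \\ \uXXXX \UXXXXXXXX
VALID_ESCAPE = re.compile(r'\\(?:[btnfr"\\]|u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})')


def first_invalid_escape(line: str):
    """Return (1-based column, character) of the first invalid escape, or None.

    Simplification (assumption): every backslash on the line is treated as
    if it were inside a basic string.
    """
    i = 0
    while i < len(line):
        if line[i] == "\\":
            match = VALID_ESCAPE.match(line, i)
            if match is None:
                # i is the 0-based index of the backslash, so the character
                # after it sits at 1-based column i + 2.
                return (i + 2, line[i + 1 : i + 2])
            i = match.end()
        else:
            i += 1
    return None


bad = r'base_directory = "D:\SashaDemoRepositories\SashaPelicanDebugging\output"'
good = 'base_directory = "D:/SashaDemoRepositories/SashaPelicanDebugging/output"'

print(first_invalid_escape(bad))   # (22, 'S')
print(first_invalid_escape(good))  # None
```

The reported position, column 22 of the `base_directory` line, is the `S` that follows the first backslash.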
    

    4. Change

    I used os.sep to convert backslashes to forward slashes. I changed this line in search.py:

    - self.output_path = output_path
    + self.output_path = output_path.replace(os.sep, '/')
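
The effect of this one-line change can be sketched outside the plugin. The example below assumes nothing about the plugin beyond the diff above; `to_forward_slashes` and `render_base_directory` are hypothetical helper names. It swaps in `pathlib.PureWindowsPath` instead of `os.sep` so that the sketch gives the same result on any operating system.

```python
from pathlib import PureWindowsPath


def to_forward_slashes(path: str) -> str:
    """Return the path with every backslash separator turned into '/'."""
    # PureWindowsPath treats both '\\' and '/' as separators and does not
    # touch the file system, so this also runs on Linux and macOS.
    return PureWindowsPath(path).as_posix()


def render_base_directory(path: str) -> str:
    """Render the base_directory line of search.toml (illustration only)."""
    return f'base_directory = "{to_forward_slashes(path)}"'


print(render_base_directory(r"D:\SashaDemoRepositories\SashaPelicanDebugging\output"))
# base_directory = "D:/SashaDemoRepositories/SashaPelicanDebugging/output"

print(render_base_directory("/home/circleci/project/output"))
# base_directory = "/home/circleci/project/output"
```

A path that already uses forward slashes passes through unchanged, which is why the change should be a no-op on *nix systems.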
    

    5. Behavior after change

    search.toml on Windows after my changes:

    - base_directory = "D:\SashaDemoRepositories\SashaPelicanDebugging\output"
    + base_directory = "D:/SashaDemoRepositories/SashaPelicanDebugging/output"
    

    This path is valid on Windows, and the build output no longer shows errors:

    D:\SashaDemoRepositories\SashaPelicanDebugging>pelican content -s pelicanconf.py --fatal warnings --ignore-cache --debug
    [11:30:32] DEBUG    Pelican version: 4.8.0                                                                                                                                                                                                                        __init__.py:531
               DEBUG    Python version: 3.10.6                                                                                                                                                                                                                        __init__.py:532
               DEBUG    Adding current directory to system path                                                                                                                                                                                                        __init__.py:66
               DEBUG    Finding namespace plugins                                                                                                                                                                                                                        _utils.py:81
               DEBUG    Namespace plugins found:                                                                                                                                                                                                                         _utils.py:84
                        pelican.plugins.search
                        pelican.plugins.sitemap
               DEBUG    Loading plugin `search`                                                                                                                                                                                                                          _utils.py:90
               DEBUG    Registering plugin `pelican.plugins.search`                                                                                                                                                                                                    __init__.py:73
               DEBUG    Found generator: ArticlesGenerator (internal)                                                                                                                                                                                                 __init__.py:209
               DEBUG    Found generator: PagesGenerator (internal)                                                                                                                                                                                                    __init__.py:209
               DEBUG    Found generator: SearchSettingsGenerator (pelican.plugins.search.search)                                                                                                                                                                      __init__.py:209
               DEBUG    Found generator: StaticGenerator (internal)                                                                                                                                                                                                   __init__.py:209
               DEBUG    Template list: ['!simple/archives.html', '!simple/article.html', '!simple/author.html', '!simple/authors.html', '!simple/base.html', '!simple/categories.html', '!simple/category.html', '!simple/gosquared.html', '!simple/index.html',     generators.py:70
                        '!simple/page.html', '!simple/pagination.html', '!simple/period_archives.html', '!simple/tag.html', '!simple/tags.html', '!simple/translations.html', '!theme/analytics.html', '!theme/archives.html', '!theme/article.html',
                        '!theme/article_infos.html', '!theme/author.html', '!theme/authors.html', '!theme/base.html', '!theme/categories.html', '!theme/category.html', '!theme/comments.html', '!theme/disqus_script.html', '!theme/github.html',
                        '!theme/index.html', '!theme/page.html', '!theme/period_archives.html', '!theme/tag.html', '!theme/taglist.html', '!theme/tags.html', '!theme/translations.html', '!theme/twitter.html', 'analytics.html', 'archives.html', 'article.html',
                        'article_infos.html', 'author.html', 'authors.html', 'base.html', 'categories.html', 'category.html', 'comments.html', 'disqus_script.html', 'github.html', 'gosquared.html', 'index.html', 'page.html', 'pagination.html',
                        'period_archives.html', 'tag.html', 'taglist.html', 'tags.html', 'translations.html', 'twitter.html']
               DEBUG    Read file Articles/KiraArticle.md -> Article                                                                                                                                                                                                   readers.py:547
               DEBUG    Signal article_generator_preread.send(ArticlesGenerator)                                                                                                                                                                                       readers.py:560
               DEBUG    Successfully imported extension module "markdown.extensions.meta".                                                                                                                                                                                core.py:163
               DEBUG    Successfully loaded extension "markdown.extensions.meta.MetaExtension".                                                                                                                                                                           core.py:126
               DEBUG    Signal article_generator_context.send(ArticlesGenerator, <metadata>)                                                                                                                                                                           readers.py:627
               DEBUG    Read file images/.keep -> Static                                                                                                                                                                                                               readers.py:547
               DEBUG    Signal static_generator_preread.send(StaticGenerator)                                                                                                                                                                                          readers.py:560
               DEBUG    Signal static_generator_context.send(StaticGenerator, <metadata>)                                                                                                                                                                              readers.py:627
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/feeds/all.atom.xml                                                                                                                                                               writers.py:163
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/feeds/articles.atom.xml                                                                                                                                                          writers.py:163
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/feeds/sasha-chernykh.atom.xml                                                                                                                                                    writers.py:163
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/feeds/sasha-chernykh.rss.xml                                                                                                                                                     writers.py:163
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/feeds/all-en.atom.xml                                                                                                                                                            writers.py:163
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/KiraArticle.html                                                                                                                                                                 writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/index.html                                                                                                                                                                       writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/tags.html                                                                                                                                                                        writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/categories.html                                                                                                                                                                  writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/authors.html                                                                                                                                                                     writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/archives.html                                                                                                                                                                    writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/category/articles.html                                                                                                                                                           writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/author/sasha-chernykh.html                                                                                                                                                       writers.py:212
               DEBUG    Search plugin reported                                                                                                                                                                                                                          search.py:118
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\css\fonts.css to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\css\fonts.css                                                                                utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\css\main.css to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\css\main.css                                                                                  utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\css\pygment.css to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\css\pygment.css                                                                            utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\css\reset.css to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\css\reset.css                                                                                utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\css\typogrify.css to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\css\typogrify.css                                                                        utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\css\wide.css to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\css\wide.css                                                                                  utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\fonts\font.css to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\fonts\font.css                                                                              utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\fonts\Yanone_Kaffeesatz_400.eot to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\fonts\Yanone_Kaffeesatz_400.eot                                            utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\fonts\Yanone_Kaffeesatz_400.svg to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\fonts\Yanone_Kaffeesatz_400.svg                                            utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\fonts\Yanone_Kaffeesatz_400.ttf to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\fonts\Yanone_Kaffeesatz_400.ttf                                            utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\fonts\Yanone_Kaffeesatz_400.woff to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\fonts\Yanone_Kaffeesatz_400.woff                                          utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\fonts\Yanone_Kaffeesatz_400.woff2 to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\fonts\Yanone_Kaffeesatz_400.woff2                                        utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\aboutme.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\aboutme.png                                                          utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\bitbucket.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\bitbucket.png                                                      utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\delicious.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\delicious.png                                                      utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\facebook.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\facebook.png                                                        utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\github.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\github.png                                                            utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\gitorious.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\gitorious.png                                                      utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\gittip.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\gittip.png                                                            utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\google-groups.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\google-groups.png                                              utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\google-plus.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\google-plus.png                                                  utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\hackernews.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\hackernews.png                                                    utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\lastfm.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\lastfm.png                                                            utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\linkedin.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\linkedin.png                                                        utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\reddit.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\reddit.png                                                            utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\rss.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\rss.png                                                                  utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\slideshare.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\slideshare.png                                                    utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\speakerdeck.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\speakerdeck.png                                                  utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\stackoverflow.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\stackoverflow.png                                              utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\twitter.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\twitter.png                                                          utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\vimeo.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\vimeo.png                                                              utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\youtube.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\youtube.png                                                          utils.py:332
               INFO     Copying D:\SashaDemoRepositories\SashaPelicanDebugging\content\images\.keep to D:\SashaDemoRepositories\SashaPelicanDebugging\output\images\.keep                                                                                                utils.py:302
               INFO     Copying D:\SashaDemoRepositories\SashaPelicanDebugging\content\images\.keep to images/.keep                                                                                                                                                 generators.py:906
    Done: Processed 1 article, 0 drafts, 0 hidden articles, 0 pages, 0 hidden pages and 0 draft pages in 0.56 seconds
    

    6. Possible consequences on UNIX

    My change shouldn't affect *nix operating systems. I checked this on Circle CI.

    1. Circle CI build with the configuration from item 2.3 of this issue, generated search.toml:

      [input]
      base_directory = "/home/circleci/project/output"
      html_selector = "body"
      
      [[input.files]]
      path = "KiraArticle.html"
      url = "/KiraArticle.html"
      title = "KiraArticle"
      
    2. I changed this line in my config.yml:

      - - run: pip install pelican-search
      + - run: pip install git+https://github.com/Kristinita/[email protected]
      

    The Circle CI build generated the same search.toml.

    I didn't see any changes in the Ubuntu build after my change.

    7. Reproducing the problem

    I can't reproduce my problem on free remote CI services. Unfortunately, installing Stork on Windows isn't quick: a Windows user must install Rust and compile Stork on their own machine. I can compile Stork on my machine, but I ran into bugs on Circle CI and AppVeyor CI.

    If you know how to compile Stork for Windows on free remote CI services, please tell me. See also my issue on the Stork issue tracker.

    8. Environment

    1. Operating system:

      1. Local: Microsoft Windows [Version 10.0.19041.1415]
      2. Circle CI: Ubuntu 22.04 LTS (Jammy Jellyfish)
    2. Python: 3.10.5, 3.10.6

    3. Pelican: 4.8.0

    4. Stork: 1.5.0

    5. pelican-search: 1.0.1

    Thanks.

    opened by Kristinita 0