Your self-hosted YouTube media server

Overview

The Tube Archivist

Core functionality

  • Subscribe to your favourite YouTube channels
  • Download videos using yt-dlp
  • Index and make videos searchable
  • Play videos
  • Keep track of viewed and unviewed videos

Screenshots

  • Home Page
  • All Channels
  • Single Channel
  • Video Page
  • Downloads Page

Problem Tube Archivist tries to solve

Once your YouTube video collection grows, it becomes hard to search for and find a specific video. That's where Tube Archivist comes in: by indexing your video collection with metadata from YouTube, you can organize, search and enjoy your archived YouTube videos offline, without hassle, through a convenient web interface.

Installation

Take a look at the example docker-compose.yml file provided. Tube Archivist depends on three main components, split up into separate Docker containers:

Tube Archivist

The main Python application that displays and serves your video collection, built with Django.

  • Serves the interface on port 8000
  • Needs a mandatory volume for the video archive at /youtube
  • And another recommended volume to save the cache for thumbnails and artwork at /cache.
  • The environment variables ES_URL and REDIS_HOST are needed to tell Tube Archivist where Elasticsearch and Redis respectively are located.
  • The environment variables HOST_UID and HOST_GID allow Tube Archivist to chown the video files to the main host system user instead of the container user.

Elasticsearch

Stores video metadata and makes everything searchable. Also keeps track of the download queue.

  • Needs to be accessible over the default port 9200
  • Needs a volume at /usr/share/elasticsearch/data to store data

Follow the documentation for additional installation details.

Redis JSON

Functions as a cache and temporary link between the application and the filesystem. Used to store and display messages and configuration variables.

  • Needs to be accessible over the default port 6379
  • Takes an optional volume at /data to make your configuration changes permanent.
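
As a rough illustration of how the three containers find each other, the sketch below reads ES_URL and REDIS_HOST from the environment and derives the host and port each service is expected on. The helper name `service_endpoints` and the fallback values are assumptions made for this example; they are not part of Tube Archivist.

```python
import os
from urllib.parse import urlparse


def service_endpoints(env=None):
    """Return (host, port) pairs for Elasticsearch and Redis.

    Hypothetical helper: the fallback values are assumptions for this
    sketch. Only the variable names and the default ports 9200 and
    6379 come from the section above.
    """
    env = os.environ if env is None else env
    es_url = urlparse(env.get("ES_URL", "http://archivist-es:9200"))
    redis_host = env.get("REDIS_HOST", "archivist-redis")
    return {
        "elasticsearch": (es_url.hostname, es_url.port or 9200),
        "redis": (redis_host, 6379),
    }


if __name__ == "__main__":
    for name, (host, port) in service_endpoints().items():
        print(f"{name}: expecting {host}:{port}")
```

Running it inside the application container prints the endpoints the application would try to reach, which can help spot a typo in the compose file.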

Getting Started

  1. Go through the settings page and look at the available options. In particular, set Download Format to your desired video quality before downloading.
  2. Subscribe to some of your favourite YouTube channels on the channels page.
  3. On the downloads page, click on Rescan subscriptions to add videos from the subscribed channels to your Download queue or click on Add to download queue to manually add Video IDs, links, channels or playlists.
  4. Click on Download queue and let Tube Archivist do its thing.
  5. Enjoy your archived collection!

Potential pitfalls

Elasticsearch in Docker requires the host machine's kernel setting vm.max_map_count to be set to at least 262144.

To set the value temporarily, run:

sudo sysctl -w vm.max_map_count=262144

How to apply the change permanently depends on your host operating system:

  • For example on Ubuntu Server add vm.max_map_count = 262144 to the file /etc/sysctl.conf.
  • On Arch based systems create a file /etc/sysctl.d/max_map_count.conf with the content vm.max_map_count = 262144.
  • On any other platform, consult its documentation on how to pass kernel parameters.
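
To verify the setting before starting the containers, a small check along the following lines may help. It is a minimal sketch that assumes a Linux host exposing the value at /proc/sys/vm/max_map_count; the function name `map_count_ok` is made up for this example.

```python
from pathlib import Path

REQUIRED_MAP_COUNT = 262144
PROC_FILE = Path("/proc/sys/vm/max_map_count")  # Linux only


def map_count_ok(raw_value, required=REQUIRED_MAP_COUNT):
    """Return True if the kernel setting meets the required minimum."""
    return int(raw_value.strip()) >= required


if __name__ == "__main__":
    if PROC_FILE.exists():
        current = PROC_FILE.read_text()
        state = "ok" if map_count_ok(current) else "too low"
        print(f"vm.max_map_count = {current.strip()} ({state})")
    else:
        print("no /proc/sys/vm/max_map_count found, is this a Linux host?")
```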

Roadmap

This should be considered a minimum viable product; there is an extensive list of future functions and improvements planned:

  • Scan your filesystem to manually add videos
  • Access control
  • User roles
  • Delete videos and channels
  • Create playlists
  • Show similar videos on video page
  • Import existing downloaded archive
  • Multi language support
  • Backup and restore

Known limitations

  • Video files created by Tube Archivist need to be mp4 video files for best browser compatibility.
  • Every limitation of yt-dlp will also be present in Tube Archivist. If yt-dlp can't download or extract a video for any reason, Tube Archivist won't be able to either.
  • For now this is meant to be run in a trusted network environment.
Comments
  • Multi-arch images?

    Thanks for working on this, it is a promising project!

    I was trying to spin-up tubearchivist on a RPi 4 running 64-bit Raspbian OS and while the containers get built fine, I run into the standard_init_linux.go:228: exec user process caused: exec format error for the redis JSON and the tubearchivist containers (the Elasticsearch seems to install fine).

    Is there a plan for building multi-arch images especially for arm64?

    opened by abhilesh 27
  • update v0.2 support thread: check the release notes and the readme.

    Hi,

    What is your error? After the last upgrade the app won't launch

    How to reproduce? Just update the app

    [archivist-redis]: ok

    [archivist-es]: (screenshot from 2022-07-23 15-24-54)

    SOLUTION: chown 1000:0 /path/to/mount/point of elasticsearch

    [tubearchivist]: (screenshot from 2022-07-23 15-26-25)

    SOLUTION: add TA_HOST in the yml

    TA_HOST=YOUR_IP or TA_HOST=YOUR_DOMAIN

    Like this:

    (screenshot from 2022-07-23 16-53-43)

    Have a nice day!

    documentation 
    opened by zarevskaya 25
  • Add LDAP attribute mapping env variables.

    When using a default Samba DC LDAP instance, uid isn't used to hold the username, so for this, and other LDAP implementations, it's necessary to be able to specify which LDAP attributes are actually used for first name, last name, username, email, etc.

    This doesn't change the default behavior, as it uses the current hardcoded values as the default values instead of requiring admins to specify it if they are upgrading to a version containing this feature.
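
The idea of configurable attributes with the old hardcoded values as defaults can be sketched roughly as follows. This is not the code from the pull request: the variable prefix `EXAMPLE_LDAP_ATTR_` and every default other than `uid` are assumptions chosen for illustration.

```python
import os

# Defaults assumed for this sketch. Only "uid" as the username
# attribute is implied by the description above.
DEFAULT_ATTR_MAP = {
    "username": "uid",
    "first_name": "givenName",
    "last_name": "sn",
    "email": "mail",
}


def ldap_attr_map(env=None, prefix="EXAMPLE_LDAP_ATTR_"):
    """Build the attribute map, falling back to the default per field."""
    env = os.environ if env is None else env
    return {
        field: env.get(f"{prefix}{field.upper()}", default)
        for field, default in DEFAULT_ATTR_MAP.items()
    }
```

An admin who sets nothing gets the previous behaviour, while a directory that stores the username elsewhere only needs to override that one field.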

    opened by BrianCArnold 20
  • Get Video Player Data Using New API

    The videoPlayer() function now gets its data from the API rather than the HTML. It still pulls the video ID from the button. Three functions were also added: getVideoPlayerData(), getVideoData(), and apiRequest(). apiRequest() makes an API request when passed an endpoint (e.g. /api/video/VIDEO_ID/player/) and a method (either "GET" or "POST") and returns the results as JSON. getVideoPlayerData() returns video player data as JSON when given a video ID (it calls apiRequest()). getVideoData() isn't used right now; it's just another example and returns video data as JSON when given a video ID (it also calls apiRequest()).
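
For readers who want to try the same endpoint outside the browser, here is a rough Python counterpart to the apiRequest() helper described above. It is a sketch under assumptions: the function names are invented here, and the `Authorization: Token ...` header is an assumed auth scheme that should be checked against the API documentation.

```python
import json
from urllib.request import Request, urlopen


def build_api_request(base_url, endpoint, method="GET", token=None):
    """Prepare a request for an endpoint such as /api/video/VIDEO_ID/player/."""
    if method not in ("GET", "POST"):
        raise ValueError(f"unsupported method: {method}")
    headers = {"Accept": "application/json"}
    if token:
        # Assumption: token-based auth header, verify against the API docs.
        headers["Authorization"] = f"Token {token}"
    return Request(base_url.rstrip("/") + endpoint, headers=headers, method=method)


def api_request(base_url, endpoint, method="GET", token=None):
    """Send the request and decode the JSON body."""
    with urlopen(build_api_request(base_url, endpoint, method, token)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Splitting request construction from sending keeps the URL and header logic easy to inspect without a running server.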

    opened by n8detar 20
  • [Bug]: Synology error seccomp unavailable when starting Elasticsearch

    Latest and Greatest

    • [X] I'm running the latest version of Tube Archivist and have read the release notes.

    Operating System

    Synology

    Your Bug Report

    Describe the bug

    Hello, I want to thank you for creating this project. I already solved the redis issue. Archivist-es is continuously restarting. I will provide a log from synology docker here in a second. I primarily use portainer to manage the containers but since elasticsearch keeps bootlooping, the container log page won't even load on portainer.

    Steps To Reproduce

    attempting to start the containers

    Expected behavior

    archivist-es should run and continuously stay active instead of restarting itself every few seconds and occupying memory usage. archivist-es.csv

    Relevant log output

    version: '3.3'
    
    services:
      tubearchivist:
        container_name: tubearchivist
        restart: unless-stopped
        image: bbilly1/tubearchivist
        ports:
          - 8100:8000
        volumes:
          - /volume3/TA_Creators:/youtube
          - /volume3/docker/tubearchivist/cache:/cache
        environment:
          - ES_URL=http://10.10.0.215:9200
          - REDIS_HOST=archivist-redis
          - HOST_UID=1024
          - HOST_GID=100
          - TA_HOST=10.10.0.215
          - TA_PASSWORD=REDACTED
          - ELASTIC_PASSWORD=REDACTED
          - TZ=EST
        depends_on:
          - archivist-es
          - archivist-redis
      archivist-redis:
        image: redislabs/rejson
        container_name: archivist-redis
        restart: unless-stopped
        expose:
          - "6379"
        volumes:
          - /volume3/docker/tubearchivist/redis:/data
        depends_on:
          - archivist-es
      archivist-es:
        image: bbilly1/tubearchivist-es
        container_name: archivist-es
        restart: unless-stopped
        environment:
          - "xpack.security.enabled=true"
          - "discovery.type=single-node"
          - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
          - ELASTIC_PASSWORD=REDACTED
        ulimits:
          memlock:
            soft: -1
            hard: -1
        volumes:
          - /volume3/docker/tubearchivist/es:/usr/share/elasticsearch/data
        expose:
          - "9200"
    

    Anything else?

    I attached my log output above since that's the only place where I could and because it only allowed me to export as a formatted csv file from docker. Apologies and hopefully the solution is easily apparent.

    question 
    opened by N72826 17
  • [docker] initial superuser created every time container starts

    So it appears that the superuser is created on container startup every time based on variables TA_USERNAME and TA_PASSWORD, even if:

    • the container is started for the non-first time
    • even if another superuser exists
    • even if the initial user is manually deleted using the admin interface

    One would presume from the wording on README, i.e. Change the environment variables TA_USERNAME and TA_PASSWORD to create the initial credentials and the example compose file, i.e. your initial TA credentials that it's only created when the container is run for the first time and/or if no other superuser exists. I believe this should be more clear in the README, and/or make the app not create a superuser if it's not the first time being run

    enhancement 
    opened by kzshantonu 17
  • Playing downloads on Safari

    Hey there – I am finding that downloaded videos won't play in Safari browsers (Mac or iOS). This seems to be an issue with how the webserver provides 'range' data to clients. This is beyond my expertise to fix. Any thoughts?

    Oh…works great on Chrome though :)

    opened by deanpribetic 17
  • [Bug] Autodelete unreliable in v1.1.3

    I run TubeArchivist in Docker on a Synology NAS (DS918+, DSM 7.0.1-42218 Update 3).

    This is the Compose file I use:

    services:
      tubearchivist_julian:
        image: bbilly1/tubearchivist:latest
        container_name: tubearchivist_julian
        volumes:
          - ./app:/cache
          - /volume1/media/youtube/julian:/youtube
        environment:
          TZ: Europe/Berlin
          ES_URL: http://tubearchivist_julian_es:9200
          REDIS_HOST: tubearchivist_julian_redis
          HOST_UID: 1026
          HOST_GID: 101
          TA_USERNAME: tube_js
          TA_PASSWORD: ${TA_PASS_JULIAN}
          ELASTIC_PASSWORD: ${TA_ELASTIC_PASS}
        ports:
          - 18000:8000
        depends_on:
          - tubearchivist_julian_es
          - tubearchivist_julian_redis
        restart: unless-stopped
      tubearchivist_julian_redis:
        image: redislabs/rejson:latest
        container_name: tubearchivist_julian_redis
        volumes:
          - ./redis:/data
        ports:
          - 6379:6379
        depends_on:
          - tubearchivist_julian_es
        restart: unless-stopped
      tubearchivist_julian_es:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.17.1
        container_name: tubearchivist_julian_es
        volumes:
          - ./es:/usr/share/elasticsearch/data
        environment:
          - "xpack.security.enabled=true"
          - "ELASTIC_PASSWORD=${TA_ELASTIC_PASS}"
          - "discovery.type=single-node"
          - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
        ulimits:
          memlock:
            soft: -1
            hard: -1
        ports:
          - 9200:9200
        restart: unless-stopped 
    

    My relevant settings in TubeArchivist are:

    • Subscriptions - Page Size - 5
    • Downloads - Auto Delete - True
    • Scheduler - Rescan - 0 */4,11-20 *
    • Scheduler - Start Download - 5 */4,11-20 *

    Since I updated to TA 1.3.0 last weekend some of my videos I mark as watched and that get deleted will get redownloaded again the next day. When asked on Discord I could not find a redownloaded video in the ignore list. This was also reported to be a bug on Unraid on Discord by somethingsuper.

    bug question 
    opened by cpt-kuesel 16
  • Some more url formats would be nice

    Channel links with the Channel name don't work:

    tubearchivist      | {'csrfmiddlewaretoken': ['AY0tO0Dav2zXgzjttPYb4lEsQ5qgYn5NE4Fzk66r983FaufAOvSgNKSTb3mBAUFQ'], 'subscribe': ['https://www.youtube.com/c/veritasium']}
    tubearchivist      | parsing subscribe ids failed!
    tubearchivist      | ['https://www.youtube.com/c/veritasium']
    

    As a workaround I currently copy the link from the channel name when watching a video. This link has the needed channel ID.

    Playlist links in this format: https://www.youtube.com/watch?v=aFPJf-wKTd0&list=UUHnyfMqiRRG1u-2MsSQLbXA&index=2 are parsed like one video but not as list.

    opened by MSDev201 16
  • Feature Request: Allow use of cookies.txt file to pass to YT-DLP

    Just in case YT wants to be picky about age restricted, etc, files - be nice to be able to point to a cookies.txt file in the tubearchivist appdata folder that it can pass to YT-DLP

    enhancement 
    opened by Marthisdil 15
  • Videos won't play

    I've left all settings at default. A couple of things are happening:

    It seems downloads are stalling and I have to hit download queue multiple times to get it going again. Nothing helpful in the logs.

    When playing a video in Firefox (Linux) I get "No video with supported format and MIME type found". On Chrome I get no controls or anything, just a screen with what I assume is the video's starting thumbnail.

    opened by Code-Slave 14
  • [Bug]: Change in SponsorBlock API breaks integration

    I've read the documentation

    Operating System

    linux

    Your Bug Report

    Describe the bug

    Sponsorblock changed their API, breaking our integration.

    Steps To Reproduce

    Activate integration on settings page and download any video that has segments registered in sponsorblock.

    Expected behavior

    Clean output and store relevant fields.

    Relevant log output

    tubearchivist  | [2022-12-31 08:06:59,973: WARNING/ForkPoolWorker-4] xxxxxxxxxxx: get sponsorblock timestamps
    tubearchivist  | [2022-12-31 08:07:00,453: ERROR/ForkPoolWorker-4] Task download_pending[925b2a8e-9cd6-493a-bdcf-e7f0aaf20a4f] raised unexpected: KeyError('userID')
    tubearchivist  | Traceback (most recent call last):
    tubearchivist  |   File "/root/.local/lib/python3.10/site-packages/celery/app/trace.py", line 451, in trace_task
    tubearchivist  |     R = retval = fun(*args, **kwargs)
    tubearchivist  |   File "/root/.local/lib/python3.10/site-packages/celery/app/trace.py", line 734, in __protected_call__
    tubearchivist  |     return self.run(*args, **kwargs)
    tubearchivist  |   File "/app/home/tasks.py", line 90, in download_pending
    tubearchivist  |     downloader.run_queue()
    tubearchivist  |   File "/app/home/src/download/yt_dlp_handler.py", line 210, in run_queue
    tubearchivist  |     vid_dict = index_new_video(
    tubearchivist  |   File "/app/home/src/index/video.py", line 405, in index_new_video
    tubearchivist  |     video.build_json()
    tubearchivist  |   File "/app/home/src/index/video.py", line 158, in build_json
    tubearchivist  |     self._get_sponsorblock()
    tubearchivist  |   File "/app/home/src/index/video.py", line 359, in _get_sponsorblock
    tubearchivist  |     sponsorblock = SponsorBlock().get_timestamps(self.youtube_id)
    tubearchivist  |   File "/app/home/src/index/video.py", line 70, in get_timestamps
    tubearchivist  |     sponsor_dict = self._get_sponsor_dict(all_segments)
    tubearchivist  |   File "/app/home/src/index/video.py", line 81, in _get_sponsor_dict
    tubearchivist  |     del segment["userID"]
    tubearchivist  | KeyError: 'userID'
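
The traceback ends in `del segment["userID"]`, which raises as soon as the key is no longer part of the response. One defensive pattern, shown here as a sketch rather than as the project's actual fix, is to drop unwanted keys with `dict.pop(key, None)`. The function name and the sample data in the usage note are hypothetical.

```python
def strip_private_fields(segments, private_keys=("userID",)):
    """Return copies of the segments without the given keys.

    dict.pop(key, None) does not raise when the key is already
    absent, unlike `del segment[key]`.
    """
    cleaned = []
    for segment in segments:
        segment = dict(segment)  # leave the caller's data untouched
        for key in private_keys:
            segment.pop(key, None)
        cleaned.append(segment)
    return cleaned
```

Called with a mix of old-style and new-style segments, for example `[{"category": "sponsor", "userID": "abc"}, {"category": "intro"}]`, it returns both without the `userID` field and without raising.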
    

    Anything else?

    No response

    bug 
    opened by bbilly1 1
  • [Bug]: Thumbnail downloading error blocks video downloading

    I've read the documentation

    Operating System

    Docker in Ubuntu 20, kernel 5.4.0-125-generic

    Your Bug Report

    Describe the bug

    Unable to download a single video. No messages/errors in the UI.

    Steps To Reproduce

    1. Add video with alias BFOSuMc3hDc
    2. Press "Download now"

    Expected behavior

    Video downloading succeeded

    Relevant log output

    [2022-12-30 23:39:39,829: INFO/MainProcess] Task home.tasks.download_single[5a4e8959-d092-4ccd-88f8-79d1ff21089c] received
    [2022-12-30 23:39:39,833: WARNING/ForkPoolWorker-4] Added to queue with priority: BFOSuMc3hDc
    [2022-12-30 23:39:46,558: WARNING/ForkPoolWorker-4] BFOSuMc3hDc: get metadata from youtube
    [2022-12-30 23:39:47,511: WARNING/ForkPoolWorker-4] UCd_sTwKqVrweTt4oAKY5y4w: get metadata from es
    [2022-12-30 23:39:47,521: WARNING/ForkPoolWorker-4] {"_index":"ta_channel","_id":"UCd_sTwKqVrweTt4oAKY5y4w","found":false}
    [2022-12-30 23:39:47,522: WARNING/ForkPoolWorker-4] UCd_sTwKqVrweTt4oAKY5y4w: scrape channel data from youtube
    [2022-12-30 23:39:47,708: WARNING/ForkPoolWorker-4] UCd_sTwKqVrweTt4oAKY5y4w: download channel thumbnail
    [2022-12-30 23:39:52,755: WARNING/ForkPoolWorker-4] UCd_sTwKqVrweTt4oAKY5y4w: retry thumbnail download https://yt3.ggpht.com/3ZWqY8gwuMEl6e2oV6WPMmJyPCAG3i_lL4malTRk8xUWtwGU54wLLJT4H6QdP8bB13ybkAuRVbM=s900-c-k-c0x00ffffff-no-rj
    [2022-12-30 23:39:58,804: WARNING/ForkPoolWorker-4] UCd_sTwKqVrweTt4oAKY5y4w: retry thumbnail download https://yt3.ggpht.com/3ZWqY8gwuMEl6e2oV6WPMmJyPCAG3i_lL4malTRk8xUWtwGU54wLLJT4H6QdP8bB13ybkAuRVbM=s900-c-k-c0x00ffffff-no-rj
    [2022-12-30 23:40:05,843: WARNING/ForkPoolWorker-4] UCd_sTwKqVrweTt4oAKY5y4w: retry thumbnail download https://yt3.ggpht.com/3ZWqY8gwuMEl6e2oV6WPMmJyPCAG3i_lL4malTRk8xUWtwGU54wLLJT4H6QdP8bB13ybkAuRVbM=s900-c-k-c0x00ffffff-no-rj
    [2022-12-30 23:40:14,854: ERROR/ForkPoolWorker-4] Task home.tasks.download_single[5a4e8959-d092-4ccd-88f8-79d1ff21089c] raised unexpected: AttributeError("'bool' object has no attribute 'convert'")
    Traceback (most recent call last):
      File "/root/.local/lib/python3.10/site-packages/celery/app/trace.py", line 451, in trace_task
        R = retval = fun(*args, **kwargs)
      File "/root/.local/lib/python3.10/site-packages/celery/app/trace.py", line 734, in __protected_call__
        return self.run(*args, **kwargs)
      File "/app/home/tasks.py", line 120, in download_single
        VideoDownloader().run_queue()
      File "/app/home/src/download/yt_dlp_handler.py", line 210, in run_queue
        vid_dict = index_new_video(
      File "/app/home/src/index/video.py", line 405, in index_new_video
        video.build_json()
      File "/app/home/src/index/video.py", line 150, in build_json
        self._add_channel()
      File "/app/home/src/index/video.py", line 204, in _add_channel
        channel.build_json(upload=True, fallback=self.youtube_meta)
      File "/app/home/src/index/channel.py", line 183, in build_json
        self.get_from_youtube(fallback)
      File "/app/home/src/index/channel.py", line 199, in get_from_youtube
        self.get_channel_art()
      File "/app/home/src/index/channel.py", line 245, in get_channel_art
        ThumbManager(self.youtube_id, item_type="channel").download(urls)
      File "/app/home/src/download/thumbnails.py", line 99, in download
        self.download_channel_art(url)
      File "/app/home/src/download/thumbnails.py", line 149, in download_channel_art
        self._download_channel_thumb(channel_thumb, skip_existing)
      File "/app/home/src/download/thumbnails.py", line 164, in _download_channel_thumb
        img_raw.convert("RGB").save(thumb_path)
    AttributeError: 'bool' object has no attribute 'convert'
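
The final frame calls `img_raw.convert("RGB")` on a bool, which suggests the thumbnail download returned False after its retries and the caller used the result unchecked. A guard along these lines would avoid the crash. This is only a sketch: `save_channel_thumb` and `load_fallback` are hypothetical names, and the image object is assumed to offer `convert()` and `save()` as in the traceback.

```python
def save_channel_thumb(img_raw, thumb_path, load_fallback):
    """Save a thumbnail, using a placeholder if the download failed.

    img_raw is either an image object with convert() and save(), or
    False/None when all download retries failed. load_fallback is a
    hypothetical callable that returns a placeholder image.
    Returns True if the placeholder was used.
    """
    used_fallback = img_raw is False or img_raw is None
    if used_fallback:
        img_raw = load_fallback()
    img_raw.convert("RGB").save(thumb_path)
    return used_fallback
```

With such a check, a missing channel thumbnail would no longer abort the video download that triggered it.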
    

    Anything else?

    No response

    question 
    opened by AlekseyLobanov 5
  • [Feature Request]: Refactor Docker image to use s6-overlay

    Already implemented?

    Your Feature Request

    Is your feature request related to a problem? Please describe.

    The docker image runs as root

    Describe the solution you'd like

    Since there are multiple processes in your image, it would be nice to refactor the current Docker image to use s6-overlay. A good example that somewhat resembles this project is Funkwhale's AIO Docker image. It's also a Python application and uses nginx and celery.

    Additional context

    Your help is needed!

    • [ ] Yes I can help with this feature request!
    enhancement 
    opened by onedr0p 1
  • [Feature Request]: custom metadata - turn off syncing metadata

    Already implemented?

    Your Feature Request

    Short Description (Metadata Sync)

    Stop syncing metadata on a per-video basis. That is, I want to be able to exclude videos of my own choosing. It's already implemented for deactivated and outdated videos.

    Please also give users a simple 'true/false' choice.

    Short Description (Metadata Edit)

    • Choose the source of metadata, (yt | custom/local )
    • Define read only fields respective fields to sync
    • TA WebUI CRUD only for metadata. (without Create)
    • modern Inline editing, yeah nice.

    Additional context

    Refreshing metadata is a good starting point to get a synced version of your favorites. On the other hand, as you may have experienced yourself, information sometimes does not stay available for long, and in my opinion an archive should guard against this. Is there any kind of history or changelog? What happens to the json.info after everything has been created and indexed? Could it be stored somewhere, as binary in some DB?

    Thanks for your time! Sorry if i went into too much detail. my real life job rubs off sometimes 👍

    Your help is needed!

    • [X] Yes I can help with this feature request!
    enhancement help wanted 
    opened by cmuc24 2
  • [Feature Request]: RSS feed downloader

    Already implemented?

    Your Feature Request

    Is your feature request related to a problem? Please describe.

    Sometimes channels delete videos shortly after uploading and scheduling a refresh of channels every hour isn’t great from a rate limit standpoint.

    Describe the solution you'd like

    Using an RSS feed to get notified of new uploads could trigger TA to search for only channels that have new videos. I found this article talking about how to get an RSS feed from YouTube: https://danielmiessler.com/blog/rss-feed-youtube-channel/

    Additional context

    I’m assuming this would require significant restructuring of the scheduler/download function. However, with this method it should “speed up” TA's downloading and refreshing, because it would only look at new videos on channels without having to go through everything.
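
To make the request more concrete, here is a minimal sketch of the feed-based check using only the standard library. It assumes the per-channel Atom feed URL pattern and the `yt` XML namespace shown below, and `new_video_ids` is a hypothetical helper, not an existing Tube Archivist function.

```python
import xml.etree.ElementTree as ET

# Assumed feed location for a channel. Fetching it is left out here.
FEED_URL = "https://www.youtube.com/feeds/videos.xml?channel_id={channel_id}"
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "yt": "http://www.youtube.com/xml/schemas/2015",
}


def new_video_ids(feed_xml, known_ids):
    """Return video IDs from an Atom feed that are not in known_ids."""
    root = ET.fromstring(feed_xml)
    found = [
        entry.findtext("yt:videoId", namespaces=NS)
        for entry in root.findall("atom:entry", NS)
    ]
    return [vid for vid in found if vid and vid not in known_ids]
```

Only channels for which this returns a non-empty list would then need a full rescan, which is the rate-limit saving the request is after.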

    Your help is needed!

    • [ ] Yes I can help with this feature request!
    enhancement question 
    opened by pairofcrocs 1
  • [Bug]: "Add to download queue" defaults to adding an entire channel if a short video URL is added. Downloading an entire channel does not download shorts.

    I've read the documentation

    Operating System

    Fedora 36, Docker, Latest image tag.

    Your Bug Report

    Describe the bug

    The "Add to download queue" function in the Downloads page performs a few checks to determine if the URL provided is a video, playlist, or channel. If the URL does not match any of these checks, the URL is defaulted to a channel. As a result, if a shorts URL is used (example https://www.youtube.com/shorts/6QImkSXqwao), the detect_from_url function in helper.py will not correctly set the URL as a video, because there is no if statement to handle a URL matching "/shorts/".

    Since there is no check for shorts in the URL, the default option on line 195 will return the download type as a channel. When the entire channel is processed, all regular videos are added to the queue, but the shorts are not added since there is no processing for shorts in get_last_youtube_videos.

    ~~Currently the only way to download a short is to convert the URL to a standard video by replacing the /shorts/ part of the URL with /watch?v=~~ edit: shorts can be downloaded by video ID alone per the wiki.

    When adding an entire channel to the download queue, short videos are not included because the get_last_youtube_videos function in subscriptions.py is only checking for videos in https://www.youtube.com/channel/{channel_id}/videos.

    Adding a shorts channel page, example https://www.youtube.com/@gensho_yasuda/shorts only loads normal videos.

    As a result, it is not possible to download a short using the standard shorts URL, add an entire channel to download normal videos and short videos together, or subscribe to a channel to download new shorts. This may apply to streams as well though I did not test this.

    Steps To Reproduce

    1. Click on "Add to download queue"
    2. Add the following short video path
    https://www.youtube.com/shorts/kHQCJDo_RzI
    
    3. Click the "Add to download queue" button
    4. The short will fail to be added to the queue and all videos in the /videos/ section of the channel will be added to the queue.

    Expected behavior

    Two parts to this.

    1. When adding a short URL, that specific video should be downloaded. Adding the following if statement to line 194 of helper.py resolves this particular issue and treats URLs with /shorts/ as videos. I don't really know any Python, so this may be a subpar fix.
            if parsed.path.startswith("/shorts/"):
                youtube_id = parsed.path.split("/")[2]
                _ = self.find_valid_id(youtube_id)
                return youtube_id, "video"
    

    I believe that it might be better to not default to channel and instead throw an error if the URL does not match channel, video, playlist, shorts, streams, etc.
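
The suggestion to raise an error instead of defaulting to a channel could look roughly like this. It is a standalone sketch, not the project's detect_from_url: the function name `classify_youtube_url` is invented, and it covers only the URL shapes mentioned in this report.

```python
from urllib.parse import parse_qs, urlparse


def classify_youtube_url(url):
    """Return (identifier, type) or raise ValueError for unknown URLs."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    parts = [part for part in parsed.path.split("/") if part]

    if parsed.path == "/watch" and "v" in query:
        return query["v"][0], "video"
    if parsed.path == "/playlist" and "list" in query:
        return query["list"][0], "playlist"
    if len(parts) == 2 and parts[0] == "shorts":
        return parts[1], "video"
    if len(parts) >= 2 and parts[0] == "channel":
        return parts[1], "channel"
    # No silent fallback: unknown shapes are reported to the caller.
    raise ValueError(f"unrecognised YouTube URL: {url}")
```

With an explicit error, a shorts URL that is not handled would surface as a message to the user instead of queueing a whole channel.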

    2. There does not appear to be any provision to automatically download shorts and streams when subscribing to a channel or adding a channel to the download queue. This would be a nice feature to have, though I suspect it might not be a good default feature and should probably be configurable. I think something similar to this request is outlined in https://github.com/tubearchivist/tubearchivist/issues/368.

    I think the only place that would need to be updated to add short and stream support at a really simple level is the get_last_youtube_videos function in subscriptions.py. It looks like there is only a check for the URL ending in /videos when looking up a channel ID. I think this would need to also perform a check to see if any shorts or streams exist and if so combine those together to add to the download queue.

    I made an attempt that kind of worked, though I don't know python. I'm sure this is shitty code and has some unexpected problems 😭.

        def get_last_youtube_videos(self, channel_id, limit=True):
            """get a list of last videos, shorts and streams from channel"""
            obs = {
                "skip_download": True,
                "extract_flat": True,
            }
            if limit:
                obs["playlistend"] = self.config["subscriptions"]["channel_size"]

            # query each channel tab and collect the entries of all of them,
            # merging the result dicts with | would keep only the last
            # "entries" list and drop the others
            entries = []
            for tab in ("videos", "shorts", "streams"):
                url = f"https://www.youtube.com/channel/{channel_id}/{tab}"
                tab_result = YtWrap(obs, self.config).extract(url)
                if tab_result:
                    entries.extend(tab_result.get("entries", []))

            if not entries:
                return False

            last_videos = [(i["id"], i["title"]) for i in entries]
            return last_videos
    

    Relevant log output

    Log after following steps to reproduce.
    
    processing: https://www.youtube.com/shorts/DFd9_sDBMv0
    ParseResult(scheme='https', netloc='www.youtube.com', path='/shorts/DFd9_sDBMv0', params='', query='', fragment='')
    [{'url': 'UCpd0n9H-HdK4GplD5RGvIDg', 'type': 'channel'}]
    [2022-12-10 15:04:29,492: INFO/MainProcess] Task home.tasks.extrac_dl[b3ef705b-2c4e-411e-9715-5175adb9f4cd] received
    [2022-12-10 15:04:29,870: WARNING/ForkPoolWorker-16] xCXlzhS5Sz8: add to download queue
    [2022-12-10 15:04:30,925: WARNING/ForkPoolWorker-16] wxpVWxkSS8g: add to download queue
    [2022-12-10 15:04:31,853: WARNING/ForkPoolWorker-16] lhYCpd1k3P8: add to download queue
    [2022-12-10 15:04:32,865: WARNING/ForkPoolWorker-16] jw5NWWyiW70: add to download queue
    

    Anything else?

    Apologies if this is a known issue. I tried to search for anything about shorts and not much came up. Thanks for all your work on this project. It really is useful and much appreciated.

    bug 
    opened by maltbeverage 5
Releases(v0.3.0)
  • v0.3.0(Nov 30, 2022)

    Project updates

    • The browser extension Tube Archivist Companion got a major update, basically a rewrite to v0.1.0, now injecting buttons directly into the YouTube page, making this much more user friendly.
    • Tube Archivist now takes system snapshots instead of json file backups for mapping changes. Make sure to activate snapshot first before updating, particularly for large indexes, to avoid a lengthy delay at first start after the update, wiki.
    • Your video and channel indexes will automatically get updated at startup to take the new mapping changes, and a new index to hold the comments will be created.
    • If you are setting your version for Elasticsearch manually, this is a good time to update to 8.5.1. If you are using archivist-es, you'll get the update automatically to take advantage of some improvements there.

    Added

    • Added comments archiving, wiki
    • API: Added endpoints for comments management, docs
    • Added tag cloud to video page, wiki
    • Added similar videos on video page, wiki
    • API: Added endpoint for similar videos, docs
    • Added startup cleanup function deleting leftover partial video files from cache/downloads
    • Added podman installation instructions, by @redxtech, wiki

    Changed

    • Changed mapping or settings update in index now triggers a snapshot instead of a json file backup, if enabled
    • Changed video page template to better integrate the inline player for similar videos
    • Changed inconsistent thumbnail path building from API template
    • Changed wording for scheduler frequency, #358
    • Changed potential pitfalls section to common errors, readme

    Fixed

    • Fixed channel deactivation error
    • Fixed channel reindex not triggering due to a mapping error in channel_last_refresh
    • Fixed playlist deactivation error
    • Fixed error deactivating a not set configuration, #362

    Hotfix

    • Image pushed again with a fix for #372.
  • v0.2.4(Nov 5, 2022)

    Project updates

    This release introduces deduplicated snapshots for the Elasticsearch index. Before activating snapshots on the settings page, you’ll have to add an additional environment variable to the archivist-es container: path.repo=/usr/share/elasticsearch/data/snapshot, also see the updated docker-compose.yml file for reference.

    The plan is to replace the current json file backup solution with snapshots, as this is a much faster and more resource and storage efficient solution. But as the snapshot files are not human readable, the current json backup solution will stay as a manual backup solution.

    It’s particularly recommended to activate snapshots for large indexes, as upcoming changes in the index will otherwise trigger a slow json backup task.

    Added

    • Added system snapshot for your metadata index, wiki
    • API: Added endpoints for snapshot management
    • Added configuration for fuzziness in searching, wiki
    • Added more detailed installation instructions, by @bakkot, Readme
    • Added more detailed contribution steps to set up your dev environment, by @bakkot, link
    • Added keyboard shortcuts for player, by @bakkot, wiki
    • Added LDAP attribute mapping, by @BrianCArnold, Readme

    Changed

    • Changed arm64 build to use ffmpeg build from yt-dlp for better compatibility

    Fixed

    • Fixed issue with yt-dlp API change not getting playlist videos anymore
    • Fixed issue where playlist was missing channel metadata
    • Fixed mobile layout overflow on downloads page
    • Fixed form validation to not allow channel page size of 0, #334
  • v0.2.3(Oct 23, 2022)

    Added

    • Added channel query filter for downloads page, wiki
    • Added downloads link for channel pages, wiki
    • Added documentation for minimal system requirements, Readme

    Changed

    • Changed sponsorblock API requests, better timeout and rate limiting handling
    • Changed internals how URL queries get parsed for future improvements

    Fixed

    • Fixed is_live key error after yt-dlp API change, #336
    • Fixed thumbnail parsing error for malformed images, #325
    • Fixed missing watched_date bulk update for channels and playlists, #309
    • Fixed Chrome compatibility issue with description text reveal, #327
    • Fixed error handling in playlist thumbnail extraction
  • v0.2.2(Sep 19, 2022)

    Added

    • Added LDAP disable cert check, by @DanielBatteryStapler
    • Added configurable grid size for downloads page
    • Added additional docker logs for manual import and add to queue

    Changed

    • Changed downloads video UI to integrate in regular video list and grid classes

    Fixed

    • Fixed manual import cleanup metadata, #311
    • Fixed manual import file extension split error, #311
    • Fixed channel creation from video metadata to catch all generic errors, #312
  • v0.2.1(Aug 20, 2022)

    Project updates

    • You can now sponsor this project here directly on GitHub. Show your support by donating to this project!

    Added

    • Offline media import from info.json file, as described in the wiki.
    • LDAP authentication support, by @DanielBatteryStapler, readme.

    Changed

    • Changed and refactored thumbnail downloader, better performance for large indexes, better integration into existing classes.
    • Changed download form placeholder wording to better represent what you can enter there, #300.
    • API: Changed search form to dedicated API endpoint, by @PrivateGER.

    Fixed

    • Fixed CSRF error for SSL reverse proxies, by @birdwing.
    • Fixed channel url redirect for old channel names, #276.
    • Fixed backup lock to prevent multiple tasks from running, #278.
    • Fixed error handling for thumbnail downloader, #228.
    • Fixed error handling on RYD api downtime, #283.
    • Fixed subtitle parser for empty subtitles, #288.
    • Fixed vertical positioning of thumbnail for full text search.

    Thank you to everybody contributing to this project!

  • v0.2.0(Jul 23, 2022)

    Breaking Changes

    • To validate from where the interface can be accessed, a new environment variable is required. Set TA_HOST to your hostname or IP, Link.
    • Tube Archivist now depends on Elasticsearch version 8, Link.
      • For peace of mind, make a manual backup before starting the update.
      • If you are using bbilly1/tubearchivist-es you will automatically get the recommended and tested version, else set your tag to 8.3.2 when using official Elasticsearch.
      • If you have been using the recommended version 7.17 before, Elasticsearch will take care of the internal upgrade automatically.
      • This will break backwards compatibility, you won’t be able to downgrade to ES7 - at least not easily.
      • Be patient, the migration can take a few minutes.
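As a rough illustration of what this kind of host validation involves, the sketch below checks an incoming Host header against the value of TA_HOST. The function names, and the assumption that TA_HOST may hold several space separated entries, are hypothetical and not taken from the Tube Archivist code base.

```python
import os


def allowed_hosts(env=None):
    """Return the hostnames listed in TA_HOST (assumed to be space separated)."""
    env = os.environ if env is None else env
    return [host.lower() for host in env.get("TA_HOST", "").split()]


def is_allowed(host_header, env=None):
    """Check a request's Host header, ignoring an optional :port suffix."""
    hostname = host_header.rsplit(":", 1)[0].lower()
    return hostname in allowed_hosts(env)
```

With TA_HOST set to `tube.local 192.168.1.10`, a request for `tube.local:8000` would pass and a request for any other hostname would be rejected. IPv6 literals are not handled in this sketch.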

    Added

    • Added validation from where this application can be served with the TA_HOST environment variable, improving security.
    • Added authentication for all static user generated files such as thumbnails and media files:
      • This might have a hopefully negligible performance impact when using a big archive page size.
      • But these improved validations give the confidence to remove the security notification from the known limitations.
    • Added keyword based search and filter for all your queries.
    • Added full text search over all your indexed subtitles.
      • As documented in the wiki.

    Changed

    • Changed the channel detail page into multiple subpages, based on the mockup by @pairofcrocs
      • Video page: Showing all videos from this channel
      • Playlist page: Showing all playlists from this channel
      • About page: Additional metadata and channel configuration form
      • As documented in the wiki.
    • Changed internal media file move functions to use shutil.move for better compatibility on some platforms, #268 by @p0358
    • Changed descriptions to show a few lines as preview, #269 by @p0358
    • Changed backup task to write page by page for better performance for big indexes but slightly slower for small indexes.
    • Changed backup zip file to only include relevant json files to restore for better performance for big indexes.
    • Changed video download cache naming structure, fixing filename sanitation issue.
    • Changed download progress message to use full video title instead of filename, #271

    Fixed

    • Fixed reindex task, deactivating non existing channels
    • Fixed webkit fullscreen scaling issue, #264 by @samdoshi
    • Fixed cookie import validator from browser extension, #266
    • Fixed nginx user permission error for some platforms, #268 by @p0358
  • v0.1.7(Jul 3, 2022)

    Project update

    • Tube Archivist Companion browser extension v0.0.3 now supports cookie sync

    Added

    • Added Periodical cookie validation
    • API: Added task GET view to return running tasks, by @lamusmaser
    • API: Added route to store cookie with POST request for browser extension

    Changed

    • Changed skip superuser creation with lockfile at startup, by @dshoreman
    • Changed cookie import to not load if validation fails
    • Changed Redis connections to auto expire
    • Refactored Redis message handling to not auto expire

    Fixed

    • Fixed various CSS scaling issues for tablet and mobile
  • v0.1.6(Jun 4, 2022)

    Project updates

    • First release built on the new build server, thanks to everybody contributing financially
    • Tube Archivist Companion Browser Extension update v0.0.2

    Added

    • User configurable grid row size for videos
    • Added TrueNAS Scale instructions, wiki, shout out to: @heavybullets8

    Changed

    • Changed cookie file handling directly from redis instead of file, yt-dlp 2022.05.18
    • Changed use embedded metadata for videos with content id, #241, shout out to @anonamouslyginger
    • Changed subtitle naming convention to .lang.vtt for new downloads, #195
    • Changed search as you type delay for better performance
    • Changed delete download queue buttons, moved to settings page under Actions header
    • Refactored yt-dlp integration into reusable base class
    • General code cleanup of unused methods

    Fixed

    • Fixed deleted video lingering in playlist metadata
    • Fixed subtitle parsing error without segments, #249
    • Fixed process truncated thumbnail images, #256
  • v0.1.5(May 8, 2022)

    Project updates

    Added

    • Added cookie import, wiki
    • API: added pagination for list views
    • API: added sort and query filter in download view
    • API: added run task view

    Changed

    • API: handle 404 in list views

    Fixed

    • Fixed arm64 build error, #234 #240
    • Fixed holding on to previous Sponsorblock timestamps, shout out to @n8detar
    • Fixed channel validation error when subscribing to playlist, #223
    • Fixed error for thumbnail re-embedding task, #231
    • Fixed autodelete error creating malformed requests to ES, #217
    • Fixed reindex error when channel name has changed on YT, #211
    • Fixed premium trailer videos video id mismatch, #237
    • Fixed timeout issue with yt-dlp check-format interrupting the UI
  • v0.1.4(Apr 16, 2022)

    Project updates

    • Tube Archivist has a new home: https://github.com/tubearchivist/tubearchivist
    • There is a minimal Browser Extension, Firefox is approved, Chrome is still pending, see installation instructions for manual install.
    • There is now bbilly1/tubearchivist-es, a set and forget Elasticsearch docker image that automatically updates with Tube Archivist to the recommended version, Readme. It is recommended for Unraid due to a limitation of how the version numbers are parsed, and optional for everybody else.
    • There is a WIP Tube Archivist Metrics container to provide Tube Archivist metrics in Prometheus/OpenMetrics format, shout-out to @ainsey11 for working on that
    • While developing the API, we are rewriting the frontend in NextJS/React, join us on Discord if you want to help. Shout-out to @insuusvenerati for taking the initiative.
    • There is an unfortunate unfixed bug in the periodic refresh task #211, requiring you to manually rename the channel folder if the name on YouTube has changed since. Check your logs for error messages.

    Added

    • Added SponsorBlock integration support, wiki and wiki, shout-out to @n8detar for implementing the skipping in the player
    • Added API endpoint for login
    • Added API endpoint for video lists
    • Added API endpoint to test connectivity
    • Added detailed Installation instructions to wiki, shout-out to @pairofcrocs

    Changed

    • Changed Dockerfile structure, reduced image size, faster build, better caching, shout-out to @Lickitysplitted for the help

    Fixed

    • Fixed nginx default conf location conflict, shout-out to @Lickitysplitted
    • Fixed timing issue with download progress message, #210 shout-out to @ainsey11
    • Fixed schedule input validator, #209
    • Fixed subtitle parsing error, resulting in failed download, #196
    • Fixed startup error message for unsupported ES version, #197
    • Fixed pagination link building error, #221

    Final notes

    Thank you for every contribution, reach out if you want to get involved too!

  • v0.1.3(Mar 26, 2022)

    Notes

    • This release will automatically rebuild your video and channel indexes
    • This release will validate a minimal Elasticsearch version of 7.17. This will be required for an upgrade to Elasticsearch 8 in a future release, as 7.17 allows for a smooth and automatic upgrade between the major releases.

    Added

    • Added dedicated continue watching section on top of homepage if you have any in progress videos
    • Added limited options for per channel customization wiki, with potential for future expansion:
      • Download format
      • Delete watched videos after x days
      • Index Playlist
    • Added startup check to validate minimal and maximal supported Elasticsearch version
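A version check of this kind has to compare version numbers numerically, not as strings, since "7.9" sorts after "7.17" alphabetically. The following is a minimal sketch of such a comparison. The helper names and the upper bound of major version 7 are assumptions for illustration; only the minimum of 7.17 comes from the notes above.

```python
def parse_version(version):
    """Turn '7.17.1' into (7, 17, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def is_supported(version, minimum="7.17", maximum_major=7):
    """True if version >= minimum and its major version <= maximum_major."""
    parsed = parse_version(version)
    return parsed >= parse_version(minimum) and parsed[0] <= maximum_major
```

Under these assumed bounds, `is_supported("7.17.1")` accepts the version while `is_supported("7.9.0")` and `is_supported("8.0.0")` reject it.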

    Changed

    • Changed how subtitles are indexed to reduce overhead by joining multiple lines in one document
    • Changed how the download queue is indexed and built for better extensibility
    • Changed index playlist is now part of per channel settings
    • Improved deploy.sh to now run into testing environment without previous configurations
    • Improved deploy.sh to install debug tools in testing environment

    Fixed

    • Fixed ignore progress bar if video is watched
    • Fixed how auto generated subtitles are parsed, #180
  • v0.1.2(Feb 26, 2022)

    Added

    • Added storing playback position and continue watching from where you left off, shout out to @n8detar
    • Added watch progress bar indicator over video thumbnail

    Changed

    • Changed API endpoints to return config key by default

    Fixed

    • Fixed a bug where subscribed playlist would auto unsubscribe when a new video was added.
      • You might want to double check your playlist subscriptions and re-subscribe if needed
    • Fixed rescan error if the channel doesn't exist anymore, #175
    • Fixed build error and better ffmpeg URL extractor from GitHub release API
    • Fixed some more small bugs, so we can create more later.

    Note

    • This broke compatibility between unstable builds and requires a reset of Redis by deleting dump.rdb in the /data volume of Redis to reset your user configurations. Regular installations with the latest tag are not affected.
  • v0.1.1(Feb 13, 2022)

    Note

    • This update will automatically recreate and change the ta_video and ta_channel index to pick up the new mappings
    • Additionally a new index ta_subtitle will get created to index subtitles

    Added

    • Added subtitle download, display and index support, wiki
      • searching subtitles is pending
    • Added Google Cast support, shout out to @n8detar
    • Added a new wiki page: FAQ
    • Added backfill functionality for videos to get missing returnyoutubedislike.com ratings, if integration is enabled
    • Added additional fields to the channel metadata indexing for future use
    • Added a hint of what to do when there are no videos yet, shout out to @SteVwonder
    • Added link to Helm Chart, shout out to @insuusvenerati
    • Added a few barely useful API endpoints, link
    • Added browser extension proof of concept, link

    Changed

    • Changed JS player: Lots of improvements on the integrated video player, shout out to @n8detar
      • Show more metadata: likes, dislikes, views
      • Auto mark video as watched at 90%
    • Changed toggle: UI improvements inverting toggle to indicate current state, shout out to @GigaFyde
    • Major refactor and reorganization of all python code for reusability and readability improvements
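The "auto mark video as watched at 90%" rule mentioned above amounts to comparing the playback position against the video duration. A minimal sketch, written in Python for consistency with the rest of the project even though the player itself is JavaScript; the function name and the handling of unknown durations are assumptions:

```python
def is_watched(position, duration, threshold=0.9):
    """Return True once playback has passed the given share of the video."""
    if duration <= 0:
        # Unknown or zero duration: never mark as watched automatically.
        return False
    return position / duration >= threshold
```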

    Fixed

    • Fixed last page error for more than 10k video pagination, #156
    • Fixed issues with non ASCII character channel name, #127, #146
      • If you were affected by this bug, delete the channels then future downloads should work fine.
    • Fixed edge case where thumbnail embed failed, added atomicparsley to the image, #155
    • Fixed previous workaround with django debug variable, #159

    Thank You

    Thank you to everybody who is contributing to the improvement of this project! Join us on our Discord.

    Help needed

    There is a proof of concept browser extension that is waiting for you to improve on, yes you! :-) Join us on Discord.

  • v0.1.0(Jan 8, 2022)

    Connect

    • We now have a brand new discord server, join us here!
    • Join us in our brand new dedicated subreddit: r/TubeArchivist

    Note

    • This update will automatically recreate the indexes to allow for better search functionality. As always, this will automatically create a backup first and can take up to a few minutes.

    Added

    • Dedicated search page for search as you type over the whole index to dynamically get search results for videos, channels and playlists.
    • English language analyzer for improved search matching, fixing stemming, plural/singular and some other issues
    • Optional integration with returnyoutubedislike.com to get dislikes and average ratings back

    Changed

    • All django views have been refactored, shout out to @pawwel thank you for your help! #115, #116
    • All search functionality is now consolidated on the dedicated search page, this will replace the previous per page search forms
    • The sort order functionality is now integrated in the view style switch area, making things more compact
    • The subscribe to channel and subscribe to playlist forms are restyled in a more compact format
    • Wiki pages got a refresh…

    Fixed

    • Fixed lots of minor UI issues...
    • Fixed auto delete error when there was nothing to auto delete, #122
    • Fixed remaining orphaned playlists of deleted channels, #118
  • v0.0.9(Dec 17, 2021)

    Manual changes

    This release requires you to set your timezone environment variable TZ for TubeArchivist, otherwise the fancy brand new scheduler won’t know what time it is.

    To note

    • Due to a mapping change this will automatically recreate the ta_playlist index in Elasticsearch on startup. As always an automatic backup of the index will be created first.
    • There is now an unstable release tag published to docker hub that will get updated between the different releases to quickly pull images to look at changes. As the name implies this is unstable WIP and only for your testing environment, more under Contributing.
    • Even though reasonably up-to-date Elasticsearch images were never vulnerable to the log4j vulnerability, this might be a good opportunity to update to the latest 7.16.1.

    Added

    • Added cron like scheduler support to automatically:
      • Rescan Subscriptions
      • Start download
      • Refresh Metadata
      • Thumbnail check
      • Index backup
    • Scheduler configurations and examples are in the wiki.
    • Added optional auto delete of watched videos, #56
    • Added remember me to define your session’s lifetime, #77
    • Added login page autofocus to user name form field, #104
    • Added port overwrite environment variables for nginx and uwsgi to deal with otherwise unresolvable port collisions, #103, read me
    • The favicons, the whole favicons and nothing but the favicons, #93
    • Dynamic copyright for footer, #107

    Changed

    • Rewrite of the notifications functionality into separate message channels to have notifications for different topics on different pages for better and extendable UI feedback.
    • Changed the video player to theater mode to use more of the available space, #98, #95
    • Changed the restore backup view on the settings page to use tagged backup names to distinguish between backup files created automatically, manually and due to an update, wiki
    • Changed the subscribe and unsubscribe button to single color coded toggle, #62

    Fixed

    • Fixed extractor error for old playlist format, #94
    • Fixed empty playlist index rescan error, #101
    • Fixed missing average rating due to disabled dislike button, #109

    Thank you

    As always, thank you to everybody opening issues and contributing to the improvement of this project!

  • v0.0.8(Nov 27, 2021)

    Index update

    This update will make changes to the Elasticsearch index storing the videos ta_video and will create a new index called ta_playlist that will contain the playlists. This should be automatic at startup and even for big archives shouldn’t take more than 1 min. Tube Archivist will automatically make a backup of the index before starting that process.

    Added

    • Playlist support: Subscribe to playlists, add videos to the queue with "Rescan subscriptions" button from downloads page, described here
    • Playlist support: Find and index playlists from selected channels from the channel detail page, described here
    • Playlist navigation at the bottom of the video page for videos part of a playlist.
    • Added original youtube link to video, channel detail and playlist detail page if the link is still available, #81
    • Added subscribe button directly to the channel and playlist, #81
    • Added delete download queue button to the downloads page, #85
    • Added a note about disk usage on the readme, #91

    Changed

    • Changed thumbnail extraction to use --check-format to make sure the thumbnail being downloaded is available, #83
    • Changed format selection to use the --check-format option to make sure the stream selected by yt-dlp is actually available, #90
      • Sadly both these changes will slow down adding videos to the download queue but will improve reliability...
    • Refactoring and performance improvements for index scanning.

    Fixed

    • Fixing very silly youtu.be extractor error, #40
    • Fixing adding KB/s unit back to settings page, #87
    • Fixing indexing and downloading multi feed videos.

    Looking for feedback

    As always, thank you to everybody opening issues and helping to improve this project. If you have any feedback on playlists in particular, there is a discussion thread going.

  • v0.0.7(Nov 1, 2021)

    Breaking Changes

    There are several breaking changes in this version:

    First take down all containers.

    1. TubeArchivist requires additional environment variables:
    • Authentication for TubeArchivist: TA_USERNAME and TA_PASSWORD
    • Authentication for Elasticsearch: ELASTIC_PASSWORD

    Elasticsearch requires these changes:

    • Enable security by adding: xpack.security.enabled=true
    • And the matching ELASTIC_PASSWORD.

    Take a look at the updated docker-compose.yml file, and use a better password than verysecret.

    2. The naming of Redis values is now standardized to allow for per user configurations. This means all your previous configurations on the settings page will fall back to the default values. Most importantly, make sure to change the Download Format options to your preference before continuing to download.

    To avoid having unused values set in Redis, it is recommended to delete the file dump.rdb from the redis volume.

    3. Not required but recommended: change the port settings for archivist-redis and archivist-es to expose. These ports don’t need to be accessible over the network. This is also changed in the updated docker-compose file.
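Before bringing the containers back up, it can help to confirm that the new variables from the first step are actually set. The snippet below is only a sketch of such a pre-flight check: the variable names come from the list above, while the helper name and the idea of running it as a separate script are assumptions.

```python
import os

# Variables named in the breaking changes above.
REQUIRED = ("TA_USERNAME", "TA_PASSWORD", "ELASTIC_PASSWORD")


def missing_variables(env=None):
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
```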

    Added

    • User authentication with limited multi user support:
      • Each user can have different interface settings
      • For now all users share the same videos and permissions...
      • Check out the Users section of the wiki for more details
    • Extending sort by options and add asc/desc switch on home page
    • Added same sorting and filtering options to the channel page as well
    • Implemented --throttled-rate option of yt-dlp
    • Re embed thumbnails into media file after downloading
    • Channel names are now supported and will get automatically translated to the correct channel ID, #40

    Changed

    • Making HOST_UID and HOST_GID optional for NFS compatibility, #58
    • Better progress information for adding to queue and rescanning functions
    • Calls to elasticsearch are now authenticated with credentials set with environment variables
    • Input forms are now validated before processing, increasing security
    • Redis keys are now name and user spaced, hence the breaking change

    Fixed

    • Fix iOS compatibility issues with format example, #61
    • Lots of additional bug fixes and improvements, #28 #60 #64 #73 #75

    Thank you to everybody opening issues and helping to improve Tube Archivist!

  • v0.0.6(Oct 17, 2021)

    Added

    • Embed thumbnail into media file postprocessor
    • Rescan filesystem to clean up index
    • Delete video button
    • Delete channel button

    Changed

    • Rewrite of the artwork extraction and downloading classes, see below for more details
    • Showing default artwork when none is available to avoid breaking the interface.
    • Average video rating now shows as nostalgic stars.
    • The watched/unwatched checkbox is now a toggle, so you can revert the change back.

    Fixed

    • New channel media folders will now get created with the correct permissions same as media files
    • Fixing an issue where a previously failed download task wouldn't clean up after itself

    New architecture support

    Additional installation instructions in the readme for:

    • arm64: Untested, looking for feedback, shout out to @lamusmaser
    • Unraid, shout out to @pairofcrocs
    • Synology, shout out to @geekedtv

    Update path

    The new thumbnail caching method is not backwards compatible. After updating, your already downloaded thumbnails will get reorganized into subfolders. Then Tube Archivist will scan your library and download all missing thumbnails. This can take a long time depending on your library size. docker-compose logs -f tubearchivist will confirm that something is happening. Then from this version on, new artwork will get downloaded once you add a video to the download queue instead of on demand when the interface needs it. This has a few key advantages:

    • Future proof the cache/video folder to not hold potentially tens of thousands of video thumbnails in one single folder.
    • Speed up the interface because all of the artwork will already be cached upfront.
    • Speed up the downloads view by using the cached thumbnails instead of loading them from youtube.
    • Guarantee that there is artwork available even if the video disappears later.
    • More in the spirit of the Archivist to make sure all relevant information is safely stored and organized.
    • And speed up searching with artwork preview in an upcoming version...
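To illustrate the idea of spreading thumbnails over subfolders instead of keeping them in a single directory, here is a small sketch. The release notes do not say how the subfolders are named, so the scheme used here (one folder per lowercased first character of the video ID) and the function name are assumptions.

```python
from pathlib import PurePosixPath


def thumb_path(video_id, cache_dir="cache/video"):
    """Place each thumbnail in a subfolder named after the ID's first character."""
    return PurePosixPath(cache_dir) / video_id[0].lower() / f"{video_id}.jpg"
```

With this assumed scheme, the thumbnail for video `wxpVWxkSS8g` would be stored as `cache/video/w/wxpVWxkSS8g.jpg`.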

    Clean up

    Due to not handling 404 errors in thumbnails extraction before, you might have ended up with some placeholder thumbnails from youtube looking like this or with a html error file for channel art work. That's not really a problem but if you want to replace them with a beautiful Tube Archivist styled placeholder instead, shut down the container and continue:

    Running this command from the cache/video folder on your host system will show all failed video thumbnails:

    find . -type f -exec md5sum {} \; | grep 2f5b1b159ee4893e015e1c373111919b
    

    If that doesn't give any output, you are golden, else this command will delete all thumbnails matching that specific hash:

    find . -type f -exec md5sum {} \; | grep 2f5b1b159ee4893e015e1c373111919b | awk '{$1 = "rm" ; print }' | bash
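For anyone who prefers to review the matches before deleting anything, the same lookup can be sketched in Python. This is an illustrative equivalent of the shell pipeline, not a script shipped with Tube Archivist; the function name is made up, and the hash is the one from the commands above.

```python
import hashlib
from pathlib import Path

# MD5 of the YouTube placeholder thumbnail, taken from the commands above.
PLACEHOLDER_MD5 = "2f5b1b159ee4893e015e1c373111919b"


def find_by_md5(root, md5_hex=PLACEHOLDER_MD5):
    """Return all files under root whose MD5 digest equals md5_hex."""
    matches = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = hashlib.md5(path.read_bytes()).hexdigest()
            if digest == md5_hex:
                matches.append(path)
    return matches
```

Calling `find_by_md5(".")` from the cache/video folder lists the affected files. Each returned path can then be removed with `path.unlink()` once the list looks right.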
    

    Similar for the channel art work, 404 errors resulted in downloading a html page with the content of just that. To find all html files run this from the cache/channel folder:

    find . -print | file -if - | grep "text/html" | awk -F: '{print $1}'
    

    Again, if you don't get any output, you are good. If you do see any matching files, don't get confused by the file ending: these aren't actually JPGs. Run this command to delete these files:

    find . -print | file -if - | grep "text/html" | awk -F: '{print $1}' | xargs rm
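A comparable check can be sketched in Python without the file utility. Note that this swaps in a much simpler technique: instead of full MIME type detection, it only looks at the first bytes of each file for an HTML doctype or an opening html tag. The function names and the probe size are assumptions.

```python
from pathlib import Path


def looks_like_html(path, probe_size=512):
    """Heuristic: does the file start with an HTML doctype or <html> tag?"""
    with open(path, "rb") as handle:
        head = handle.read(probe_size).lstrip().lower()
    return head.startswith((b"<!doctype html", b"<html"))


def find_html_files(root):
    """Return all files under root that look like HTML, whatever their extension."""
    return [
        path
        for path in sorted(Path(root).rglob("*"))
        if path.is_file() and looks_like_html(path)
    ]
```

Run from the cache/channel folder, `find_html_files(".")` returns the suspicious files so they can be inspected before removal.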
    

    Sorry for the complications....

  • v0.0.5(Oct 3, 2021)

    Added

    • Added grid and list view switch for all archive pages
    • The Downloads view now has an ignored videos toggle that lists all previously ignored videos and provides options to unignore them
    • The Github wiki is now where all the user documentation is located.
    • Tube Archivist can now integrate with a custom RedisJSON port.
    • Added a section in the contributing page about how to set up your testing environment.

    Changed

    • Tube Archivist now utilizes the patched nightly builds of ffmpeg for best compatibility with yt-dlp, #37 #26
    • The About page now contains just useful links, as the documentation is now consolidated into the GitHub wiki.
    • Converted some true/false dropdowns to a toggle switch.
    • The “Download Queue” button on the download page is now called “Start Download” for better clarity.

    Fixed

    • Cleaned up startup functions into a dedicated Django ready method to avoid double execution.
    • There is still a lot of refactoring and cleaning up going on.
  • v0.0.4(Sep 26, 2021)

    Added

    • Added a readme section about updating Tube Archivist and expected future changes.
    • Added some donating links, #29
    • Added a docs folder to start working on a Tube Archivist wiki. Asking for help @TechnicallyOffbeat

    Changed

    • Changed how the download queue works: Is now dynamic to allow for gracefully stopping and ungracefully killing the process.
    • Default download limit value is now disabled on new installations, as the dynamic queue offers better ways to stop the download process.
    • The “Download now” function lets you set a video as a priority download in front of an already running queue.
    • Additionally the download order is more logical now: New videos get added to the back of the queue, videos start downloading from the top of the queue.
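The queue behaviour described above (new videos go to the back, downloads start from the top, "Download now" jumps the line) can be pictured with a double-ended queue. This is only a conceptual sketch: the class and method names are hypothetical, and the actual queue is kept in the application's data stores rather than in memory.

```python
from collections import deque


class DownloadQueue:
    """In-memory model of the download order described in the release notes."""

    def __init__(self):
        self._items = deque()

    def add(self, video_id):
        """New videos go to the back of the queue."""
        self._items.append(video_id)

    def download_now(self, video_id):
        """Move (or insert) a video to the front as a priority download."""
        if video_id in self._items:
            self._items.remove(video_id)
        self._items.appendleft(video_id)

    def next(self):
        """Take the next video from the top, or None when the queue is empty."""
        return self._items.popleft() if self._items else None
```

Because the queue is consulted again for every download, stopping gracefully only requires that no further `next()` call is made after the current video finishes.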

    Fixed

    • Better error handling in add to download form
    • Sanitizing directory scan output from hidden and temporary files, #30
  • v0.0.3(Sep 22, 2021)

    Added

    • Added support to restore index from backup zip file
    • Post-processors support for yt-dlp and optional embedding of metadata into media file, #21 shout out to @nifoc
    • Now showing current version number in the footer for easy reference
    • There is now a CONTRIBUTING.md file
    • Linting and code formatting rules, shout out to @cclauss

    Changed

    • Lots and lots of improvements, refactoring, cleaning in the code base to make things presentable

    Fixed

    • Fixed lots of grammar and spelling issues, shout out to @TechnicallyOffbeat
  • v0.0.2(Sep 17, 2021)

    Added

    • backup metadata db to disk
    • readme section about Elasticsearch permission error

    Changed

    • Download view now has pagination to avoid loading too many thumbnails at once

    Fixed

    • now publishing same image to docker for latest and newest semantic version
    • fixed blocking issue with download now
    • fixed staticfile collection throwing an error on container restart because files are already there
  • v0.0.1(Sep 15, 2021)

    Added

    • Importing and indexing existing video collection into archive.

    Changed

    • Subscribe to channel now takes a list of channels

    Fixed

    • Fixed scraping issue with EU cookie consent screen, #2
    • Fixed issue where subscribing to invalid channel IDs froze the interface
Owner
Simon
Free and open source software enthusiast.