google-cloud-bigtable (🥈31 · ⭐ 3.5K) - Google Cloud Bigtable API client library. Apache-2

Overview

Python Client for Google Cloud Bigtable

Google Cloud Bigtable is Google's NoSQL Big Data database service. It's the same database that powers many core Google services, including Search, Analytics, Maps, and Gmail.

Quick Start

In order to use this library, you first need to go through the following steps:

  1. Select or create a Cloud Platform project.
  2. Enable billing for your project.
  3. Enable the Cloud Bigtable API.
  4. Set up authentication.

Installation

Install this library in a virtualenv using pip. virtualenv is a tool to create isolated Python environments. The basic problem it addresses is one of dependencies and versions, and indirectly permissions.

With virtualenv, it's possible to install this library without needing system install permissions, and without clashing with the installed system dependencies.

Supported Python Versions

Python >= 3.5

Deprecated Python Versions

Python == 2.7. Python 2.7 support will be removed on January 1, 2020.

Mac/Linux

pip install virtualenv
virtualenv <your-env>
source <your-env>/bin/activate
<your-env>/bin/pip install google-cloud-bigtable

Windows

pip install virtualenv
virtualenv <your-env>
<your-env>\Scripts\activate
<your-env>\Scripts\pip.exe install google-cloud-bigtable

Next Steps

google-cloud-happybase

In addition to the core google-cloud-bigtable, we provide a google-cloud-happybase library with the same interface as the popular HappyBase library. Unlike HappyBase, google-cloud-happybase uses google-cloud-bigtable under the covers, rather than Apache HBase.

Comments
  • Bigtable python client - memory leak in every requests

    Environment details

    1. Google cloud bigtable
    2. Ubuntu 18.04
    3. python 3.7
    4. google-cloud-bigtable==1.6.1

    The issue

    We have seen a memory leak in our Python program that uses google-cloud-bigtable. Memory grows slowly, but after a few days our program consumes all the memory of our VM. After investigating, we saw that the leak occurs every time we make a Bigtable request (read or write). We found some reports of memory leaks when people re-create the client every time they make a request. The solution offered was always to close the gRPC channel after each request or (better) to use a single client, e.g.:

    https://stackoverflow.com/questions/54371554/bigtable-grpc-memory-leak-how-to-resolve https://github.com/googleapis/google-cloud-python/issues/7207

    In our case we already use a single client but still have a memory leak.

    Code example

    To illustrate the issue, we ran two tests and monitored memory growth in each:

    1. With a single client, we make many read requests.
    2. We make the same read requests, but create a new client for each one and explicitly close the gRPC channel afterwards.

    In both cases, memory keeps growing (memory leak).

    Test case 1

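        # Imports needed by both test cases.
        from time import sleep
    
        from google.cloud.bigtable.client import Client
        from google.cloud.bigtable.instance import Instance
        from google.cloud.bigtable.table import Table
    
        def run_test(self, instance_id, json_credentials, table_id, row_key):
            # A single client is created once and reused for every request.
            client = Client.from_service_account_json(json_credentials)
            instance = Instance(instance_id, client)
    
            table = Table(table_id, instance)
            for i in range(2000):
                partial_row = table.read_row(row_key)
                sleep(0.05)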
    

    image

    Test case 2

        def run_test_renew_client(self, instance_id, json_credentials, table_id, row_key):
            for i in range(2000):
                # A fresh client is created for every request ...
                client = Client.from_service_account_json(json_credentials)
                instance = Instance(instance_id, client)
    
                table = Table(table_id, instance)
                partial_row = table.read_row(row_key)
    
                # ... and its gRPC channel is closed explicitly afterwards.
                client.table_data_client.transport.channel.close()
                sleep(0.05)
    

    image

    As you can see, the memory grows slowly but steadily.
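A measurement like the one described above can be approximated with the standard library alone. The sketch below is not the reporter's tooling: `measure_growth`, `leaky_operation` and `clean_operation` are hypothetical names, and the two operations are stand-ins for a real Bigtable request. Note also that `tracemalloc` only sees allocations made through Python's allocators, so a leak inside a C extension may not show up here and would need process-level (RSS) monitoring instead.

```python
import tracemalloc


def measure_growth(operation, iterations=200, warmup=20):
    """Return the net bytes still allocated after calling `operation` repeatedly."""
    # Warm up first so one-time caches and lazy imports are not counted as a leak.
    for _ in range(warmup):
        operation()

    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    for _ in range(iterations):
        operation()
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


# Hypothetical stand-ins for a request: one retains memory, one does not.
_retained = []


def leaky_operation():
    _retained.append(bytearray(10_000))  # kept alive by the module-level list


def clean_operation():
    bytearray(10_000)  # released as soon as the call returns


if __name__ == "__main__":
    print("leaky:", measure_growth(leaky_operation))
    print("clean:", measure_growth(clean_operation))
```

Replacing the stand-in with a function that performs one real read would show whether Python-level allocations grow per request.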

    Thanks in advance!

    api: bigtable needs more info status: investigating type: question 
    opened by tojobac 16
  • Bigtable: plumb 'transport's through from 'Client' to GAPIC generated clients

    /cc @sduskis

    Two issues:

    • bigtable.client.Client.__init__ takes a channel argument, but does nothing with it: the _channel attribute is never used.

    • The current GAPIC-generated clients still accept a channel argument, but are deprecating it in favor of the transport argument, where the transport is a generated wrapper around the channel, providing the method stubs needed to support the API surface.

    We should deprecate the channel argument to bigtable.client.Client.__init__, and add a new transport argument. If passed, the transport argument should be forwarded through to the three GAPIC-generated clients. If channel is passed, we should forward it through as well, until the GAPIC-generated code no longer supports it.
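The proposal above could look roughly like the following. This is only a sketch under stated assumptions: `GeneratedClient` and `Client` here are hypothetical stand-ins, not the real `bigtable.client.Client` or the GAPIC classes, and the warning text is invented for illustration.

```python
import warnings


class GeneratedClient:
    """Hypothetical stand-in for a GAPIC-generated client."""

    def __init__(self, transport=None, channel=None):
        self.transport = transport
        self.channel = channel


class Client:
    """Hypothetical wrapper that forwards `transport` and `channel`."""

    def __init__(self, transport=None, channel=None):
        if channel is not None:
            # Keep accepting `channel`, but steer callers toward `transport`.
            warnings.warn(
                "'channel' is deprecated; pass 'transport' instead.",
                DeprecationWarning,
                stacklevel=2,
            )
        self._transport = transport
        self._channel = channel

    def _make_generated_client(self):
        # Forward whatever the caller supplied instead of silently dropping it.
        return GeneratedClient(transport=self._transport, channel=self._channel)
```

With this shape, `Client(channel=...)` keeps working while emitting a `DeprecationWarning`, and `Client(transport=...)` is passed through untouched.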

    api: bigtable type: feature request priority: p1 
    opened by tseaver 9
  • Synthesis failed for python-bigtable

    Hello! Autosynth couldn't regenerate python-bigtable. :broken_heart:

    Please investigate and fix this issue within 5 business days. While it remains broken, this library cannot be updated with changes to the python-bigtable API, and the library grows stale.

    See https://github.com/googleapis/synthtool/blob/master/autosynth/TroubleShooting.md for troubleshooting tips.

    Here's the output from running synth.py:

    definition of repository 'com_google_protoc_java_resource_names_plugin' which is a http_archive (rule definition at /home/kbuilder/.cache/bazel/_bazel_kbuilder/a732f932c2cbeb7e37e1543f189a2a73/external/bazel_tools/tools/build_defs/repo/http.bzl:296:16):
     - <builtin>
     - /home/kbuilder/.cache/synthtool/googleapis/WORKSPACE:234:1
    DEBUG: Rule 'protoc_docs_plugin' indicated that a canonical reproducible form can be obtained by modifying arguments sha256 = "33b387245455775e0de45869c7355cc5a9e98b396a6fc43b02812a63b75fee20"
    DEBUG: Call stack for the definition of repository 'protoc_docs_plugin' which is a http_archive (rule definition at /home/kbuilder/.cache/bazel/_bazel_kbuilder/a732f932c2cbeb7e37e1543f189a2a73/external/bazel_tools/tools/build_defs/repo/http.bzl:296:16):
     - <builtin>
     - /home/kbuilder/.cache/synthtool/googleapis/WORKSPACE:258:1
    DEBUG: Rule 'rules_python' indicated that a canonical reproducible form can be obtained by modifying arguments sha256 = "48f7e716f4098b85296ad93f5a133baf712968c13fbc2fdf3a6136158fe86eac"
    DEBUG: Call stack for the definition of repository 'rules_python' which is a http_archive (rule definition at /home/kbuilder/.cache/bazel/_bazel_kbuilder/a732f932c2cbeb7e37e1543f189a2a73/external/bazel_tools/tools/build_defs/repo/http.bzl:296:16):
     - <builtin>
     - /home/kbuilder/.cache/synthtool/googleapis/WORKSPACE:42:1
    DEBUG: Rule 'gapic_generator_python' indicated that a canonical reproducible form can be obtained by modifying arguments sha256 = "fe995def6873fcbdc2a8764ef4bce96eb971a9d1950fe9db9be442f3c64fb3b6"
    DEBUG: Call stack for the definition of repository 'gapic_generator_python' which is a http_archive (rule definition at /home/kbuilder/.cache/bazel/_bazel_kbuilder/a732f932c2cbeb7e37e1543f189a2a73/external/bazel_tools/tools/build_defs/repo/http.bzl:296:16):
     - <builtin>
     - /home/kbuilder/.cache/synthtool/googleapis/WORKSPACE:278:1
    DEBUG: Rule 'com_googleapis_gapic_generator_go' indicated that a canonical reproducible form can be obtained by modifying arguments sha256 = "c0d0efba86429cee5e52baf838165b0ed7cafae1748d025abec109d25e006628"
    DEBUG: Call stack for the definition of repository 'com_googleapis_gapic_generator_go' which is a http_archive (rule definition at /home/kbuilder/.cache/bazel/_bazel_kbuilder/a732f932c2cbeb7e37e1543f189a2a73/external/bazel_tools/tools/build_defs/repo/http.bzl:296:16):
     - <builtin>
     - /home/kbuilder/.cache/synthtool/googleapis/WORKSPACE:300:1
    DEBUG: Rule 'gapic_generator_php' indicated that a canonical reproducible form can be obtained by modifying arguments sha256 = "3dffc5c34a5f35666843df04b42d6ce1c545b992f9c093a777ec40833b548d86"
    DEBUG: Call stack for the definition of repository 'gapic_generator_php' which is a http_archive (rule definition at /home/kbuilder/.cache/bazel/_bazel_kbuilder/a732f932c2cbeb7e37e1543f189a2a73/external/bazel_tools/tools/build_defs/repo/http.bzl:296:16):
     - <builtin>
     - /home/kbuilder/.cache/synthtool/googleapis/WORKSPACE:364:1
    DEBUG: Rule 'gapic_generator_csharp' indicated that a canonical reproducible form can be obtained by modifying arguments sha256 = "4db430cfb9293e4521ec8e8138f8095faf035d8e752cf332d227710d749939eb"
    DEBUG: Call stack for the definition of repository 'gapic_generator_csharp' which is a http_archive (rule definition at /home/kbuilder/.cache/bazel/_bazel_kbuilder/a732f932c2cbeb7e37e1543f189a2a73/external/bazel_tools/tools/build_defs/repo/http.bzl:296:16):
     - <builtin>
     - /home/kbuilder/.cache/synthtool/googleapis/WORKSPACE:386:1
    DEBUG: Rule 'gapic_generator_ruby' indicated that a canonical reproducible form can be obtained by modifying arguments sha256 = "a14ec475388542f2ea70d16d75579065758acc4b99fdd6d59463d54e1a9e4499"
    DEBUG: Call stack for the definition of repository 'gapic_generator_ruby' which is a http_archive (rule definition at /home/kbuilder/.cache/bazel/_bazel_kbuilder/a732f932c2cbeb7e37e1543f189a2a73/external/bazel_tools/tools/build_defs/repo/http.bzl:296:16):
     - <builtin>
     - /home/kbuilder/.cache/synthtool/googleapis/WORKSPACE:400:1
    DEBUG: /home/kbuilder/.cache/bazel/_bazel_kbuilder/a732f932c2cbeb7e37e1543f189a2a73/external/rules_python/python/pip.bzl:61:5: DEPRECATED: the pip_repositories rule has been replaced with pip_install, please see rules_python 0.1 release notes
    DEBUG: Rule 'bazel_skylib' indicated that a canonical reproducible form can be obtained by modifying arguments sha256 = "1dde365491125a3db70731e25658dfdd3bc5dbdfd11b840b3e987ecf043c7ca0"
    DEBUG: Call stack for the definition of repository 'bazel_skylib' which is a http_archive (rule definition at /home/kbuilder/.cache/bazel/_bazel_kbuilder/a732f932c2cbeb7e37e1543f189a2a73/external/bazel_tools/tools/build_defs/repo/http.bzl:296:16):
     - <builtin>
     - /home/kbuilder/.cache/synthtool/googleapis/WORKSPACE:35:1
    Analyzing: target //google/bigtable/v2:bigtable-v2-py (1 packages loaded, 0 targets configured)
    INFO: Call stack for the definition of repository 'zlib' which is a http_archive (rule definition at /home/kbuilder/.cache/bazel/_bazel_kbuilder/a732f932c2cbeb7e37e1543f189a2a73/external/bazel_tools/tools/build_defs/repo/http.bzl:296:16):
     - <builtin>
     - /home/kbuilder/.cache/bazel/_bazel_kbuilder/a732f932c2cbeb7e37e1543f189a2a73/external/com_google_protobuf/protobuf_deps.bzl:19:9
     - /home/kbuilder/.cache/synthtool/googleapis/WORKSPACE:57:1
    ERROR: /home/kbuilder/.cache/bazel/_bazel_kbuilder/a732f932c2cbeb7e37e1543f189a2a73/external/upb/bazel/upb_proto_library.bzl:257:29: aspect() got unexpected keyword argument 'incompatible_use_toolchain_transition'
    ERROR: Analysis of target '//google/bigtable/v2:bigtable-v2-py' failed; build aborted: error loading package '@com_github_grpc_grpc//': in /home/kbuilder/.cache/bazel/_bazel_kbuilder/a732f932c2cbeb7e37e1543f189a2a73/external/com_github_grpc_grpc/bazel/grpc_build_system.bzl: Extension file 'bazel/upb_proto_library.bzl' has errors
    INFO: Elapsed time: 0.542s
    INFO: 0 processes.
    FAILED: Build did NOT complete successfully (3 packages loaded, 3 targets configured)
    FAILED: Build did NOT complete successfully (3 packages loaded, 3 targets configured)
    
    Traceback (most recent call last):
      File "/home/kbuilder/.pyenv/versions/3.6.9/lib/python3.6/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/home/kbuilder/.pyenv/versions/3.6.9/lib/python3.6/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/tmpfs/src/github/synthtool/synthtool/__main__.py", line 102, in <module>
        main()
      File "/tmpfs/src/github/synthtool/env/lib/python3.6/site-packages/click/core.py", line 829, in __call__
        return self.main(*args, **kwargs)
      File "/tmpfs/src/github/synthtool/env/lib/python3.6/site-packages/click/core.py", line 782, in main
        rv = self.invoke(ctx)
      File "/tmpfs/src/github/synthtool/env/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/tmpfs/src/github/synthtool/env/lib/python3.6/site-packages/click/core.py", line 610, in invoke
        return callback(*args, **kwargs)
      File "/tmpfs/src/github/synthtool/synthtool/__main__.py", line 94, in main
        spec.loader.exec_module(synth_module)  # type: ignore
      File "<frozen importlib._bootstrap_external>", line 678, in exec_module
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "/home/kbuilder/.cache/synthtool/python-bigtable/synth.py", line 31, in <module>
        include_protos=True,
      File "/tmpfs/src/github/synthtool/synthtool/gcp/gapic_bazel.py", line 52, in py_library
        return self._generate_code(service, version, "python", False, **kwargs)
      File "/tmpfs/src/github/synthtool/synthtool/gcp/gapic_bazel.py", line 204, in _generate_code
        shell.run(bazel_run_args)
      File "/tmpfs/src/github/synthtool/synthtool/shell.py", line 39, in run
        raise exc
      File "/tmpfs/src/github/synthtool/synthtool/shell.py", line 33, in run
        encoding="utf-8",
      File "/home/kbuilder/.pyenv/versions/3.6.9/lib/python3.6/subprocess.py", line 438, in run
        output=stdout, stderr=stderr)
    subprocess.CalledProcessError: Command '['bazel', '--max_idle_secs=240', 'build', '//google/bigtable/v2:bigtable-v2-py']' returned non-zero exit status 1.
    2021-04-27 01:26:41,087 autosynth [ERROR] > Synthesis failed
    2021-04-27 01:26:41,087 autosynth [DEBUG] > Running: git reset --hard HEAD
    HEAD is now at be17038 chore: release 2.1.0 (#294)
    2021-04-27 01:26:41,094 autosynth [DEBUG] > Running: git checkout autosynth
    Switched to branch 'autosynth'
    2021-04-27 01:26:41,099 autosynth [DEBUG] > Running: git clean -fdx
    Removing __pycache__/
    Traceback (most recent call last):
      File "/home/kbuilder/.pyenv/versions/3.6.9/lib/python3.6/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/home/kbuilder/.pyenv/versions/3.6.9/lib/python3.6/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/tmpfs/src/github/synthtool/autosynth/synth.py", line 356, in <module>
        main()
      File "/tmpfs/src/github/synthtool/autosynth/synth.py", line 191, in main
        return _inner_main(temp_dir)
      File "/tmpfs/src/github/synthtool/autosynth/synth.py", line 336, in _inner_main
        commit_count = synthesize_loop(x, multiple_prs, change_pusher, synthesizer)
      File "/tmpfs/src/github/synthtool/autosynth/synth.py", line 68, in synthesize_loop
        has_changes = toolbox.synthesize_version_in_new_branch(synthesizer, youngest)
      File "/tmpfs/src/github/synthtool/autosynth/synth_toolbox.py", line 259, in synthesize_version_in_new_branch
        synthesizer.synthesize(synth_log_path, self.environ)
      File "/tmpfs/src/github/synthtool/autosynth/synthesizer.py", line 120, in synthesize
        synth_proc.check_returncode()  # Raise an exception.
      File "/home/kbuilder/.pyenv/versions/3.6.9/lib/python3.6/subprocess.py", line 389, in check_returncode
        self.stderr)
    subprocess.CalledProcessError: Command '['/tmpfs/src/github/synthtool/env/bin/python3', '-m', 'synthtool', '--metadata', 'synth.metadata', 'synth.py', '--']' returned non-zero exit status 1.
    
    

    Google internal developers can see the full log here.

    api: bigtable type: bug priority: p2 autosynth failure 
    opened by yoshi-automation 8
  • chore: ensure mypy passes

    related to #447

    mypy -p google.cloud.bigtable --no-incremental
    Success: no issues found in 17 source files
    

    For bigtable, pointing mypy at the subfolder fails, but checking the package works fine. Ignoring for now.

    ❯ ./.nox/unit-3-10/bin/mypy google/cloud/bigtable
    google/cloud/bigtable/__init__.py: error: Source file found twice under different module names: "cloud.bigtable.client" and "google.cloud.bigtable.client"
    Found 1 error in 1 file (errors prevented further checking)
    
    api: bigtable cla: yes 
    opened by crwilcox 7
  • fix: address issue in establishing an emulator connection

    The credential and channel setup for emulators appears to have a defect. This change follows the approach taken in the recent Firestore overhaul, which should resolve those issues.

    Fixes #243 #184

    api: bigtable cla: yes 
    opened by crwilcox 7
  • Big Table Memory Leak - Read Rows / Consume All

    Environment details

    1. API: BigTable
    2. OS type and version: Container-Optimized OS from Google (GKE)
    3. Python version (python --version): Python 3.6.5
    4. google-cloud-bigtable version (pip freeze): v1.2.1

    Steps to reproduce / Bug Logical Reasoning

    When deploying the micro service within GKE, the following code example will consume the memory of a pod until it reaches the limit of allocated memory value for the pod. This behavior continues no matter the size of the memory limit.

    It has been determined that read_job.consume_all() is the source of the memory leak due to the following logic:

    The value that this function returns is an empty dict, so no code upstream of this function can be causing the memory leak. Tests were also performed with the functionality removed, and the observed memory leak was eliminated. It is therefore conclusive that the leak is a part of the code example below.

    To further pinpoint where the leak occurs: when the read_job.consume_all() call is removed, the memory issues cease to exist. We conclude that the bug resides within the consume_all functionality.
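One way to narrow down this kind of report further is to count live objects by type before and after the suspect call. The following is a generic standard-library sketch, not something the reporter ran; `count_objects_by_type` and `growth_by_type` are hypothetical helper names. It only sees objects tracked by Python's garbage collector, so memory held in C extensions would not appear.

```python
import gc
from collections import Counter


def count_objects_by_type():
    """Count GC-tracked objects, grouped by type name."""
    gc.collect()  # drop unreachable cycles so only live objects are counted
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth_by_type(operation, iterations=100):
    """Return (type name, extra instances) pairs that grew across the calls."""
    before = count_objects_by_type()
    for _ in range(iterations):
        operation()
    after = count_objects_by_type()
    # Counter subtraction keeps only the types whose count increased.
    return (after - before).most_common()
```

Passing a zero-argument function that performs one read (for example a wrapper around the function shown in the report) would list which Python types accumulate per call, if any.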

    Code example

    Full Function

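    import gc
    
    from google.cloud.bigtable.row_set import RowSet
    
    # FEATURES_TABLE, get_client_table, format_row_key and close_client are the
    # reporter's own helpers and are not shown in the report.
    def multi_get_by_row_keys(row_keys, bt_table=FEATURES_TABLE):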
        bt_table, client = get_client_table()
        row_keys = [
            format_row_key(row_key)
            for row_key in row_keys
        ]
    
        row_set = RowSet()
        for key in row_keys:
            row_set.add_row_key(key)
        read_job = bt_table.read_rows(row_set=row_set)
    
        row_dict = {}
        read_job.consume_all()
    
        row_dict = read_job.rows
    
        read_job.cancel()
        close_client(client)
        del read_job
        del bt_table
        del client
        del row_set
        gc.collect()
    
        return {}
    

    Perceived section causing Memory Leak

        read_job = bt_table.read_rows(row_set=row_set)
    
        row_dict = {}
        read_job.consume_all()
    
        row_dict = read_job.rows
    
        read_job.cancel()
        close_client(client)
    

    Stack trace

    No error logs occur. Pods are evicted as they consume more memory than allocated.

    api: bigtable needs more info type: bug priority: p2 
    opened by alexgordonpandera 7
  • feat: add WarmAndPing request for channel priming

    PiperOrigin-RevId: 431126635

    Source-Link: https://github.com/googleapis/googleapis/commit/dbfbfdb38a2f891384779a7ee31deb15adba6659

    Source-Link: https://github.com/googleapis/googleapis-gen/commit/4b4a4e7fec73c60913d8b1061f0b29affb1e2a72 Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiNGI0YTRlN2ZlYzczYzYwOTEzZDhiMTA2MWYwYjI5YWZmYjFlMmE3MiJ9

    chore: update copyright year to 2022 PiperOrigin-RevId: 431037888

    Source-Link: https://github.com/googleapis/googleapis/commit/b3397f5febbf21dfc69b875ddabaf76bee765058

    Source-Link: https://github.com/googleapis/googleapis-gen/commit/510b54e1cdefd53173984df16645081308fe897e Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiNTEwYjU0ZTFjZGVmZDUzMTczOTg0ZGYxNjY0NTA4MTMwOGZlODk3ZSJ9

    PiperOrigin-RevId: 430730865

    Source-Link: https://github.com/googleapis/googleapis/commit/ea5800229f73f94fd7204915a86ed09dcddf429a

    Source-Link: https://github.com/googleapis/googleapis-gen/commit/ca893ff8af25fc7fe001de1405a517d80446ecca Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiY2E4OTNmZjhhZjI1ZmM3ZmUwMDFkZTE0MDVhNTE3ZDgwNDQ2ZWNjYSJ9

    PiperOrigin-RevId: 428795660

    Source-Link: https://github.com/googleapis/googleapis/commit/6cce671cb21e5ba9ee785dfe50f5a86b87bb5f21

    Source-Link: https://github.com/googleapis/googleapis-gen/commit/2282bc1b081364ea783300be91a8c14cb4a718c4 Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiMjI4MmJjMWIwODEzNjRlYTc4MzMwMGJlOTFhOGMxNGNiNGE3MThjNCJ9

    chore: use gapic-generator-python 0.63.2

    PiperOrigin-RevId: 427792504

    Source-Link: https://github.com/googleapis/googleapis/commit/55b9e1e0b3106c850d13958352bc0751147b6b15

    Source-Link: https://github.com/googleapis/googleapis-gen/commit/bf4e86b753f42cb0edb1fd51fbe840d7da0a1cde Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiYmY0ZTg2Yjc1M2Y0MmNiMGVkYjFmZDUxZmJlODQwZDdkYTBhMWNkZSJ9

    api: bigtable owl-bot-copy 
    opened by gcf-owl-bot[bot] 6
  • feat: add Autoscaling API

    PiperOrigin-RevId: 410080804

    Source-Link: https://github.com/googleapis/googleapis/commit/0fd6a324383fdd1220c9a937b2eef37f53764664

    Source-Link: https://github.com/googleapis/googleapis-gen/commit/788247b7cbda5b05f2ac4f6c13f10ff265e183f0 Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiNzg4MjQ3YjdjYmRhNWIwNWYyYWM0ZjZjMTNmMTBmZjI2NWUxODNmMCJ9

    api: bigtable cla: yes owl-bot-copy 
    opened by gcf-owl-bot[bot] 6
  • feat: add 'Instance.create_time' field

    Committer: @gdcolella PiperOrigin-RevId: 404267819

    Source-Link: https://github.com/googleapis/googleapis/commit/324f036d9dcc21318d89172ceaba5e0fd2377271

    Source-Link: https://github.com/googleapis/googleapis-gen/commit/2fada43b275eaaadd279838baf1120bddcffc762 Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiMmZhZGE0M2IyNzVlYWFhZGQyNzk4MzhiYWYxMTIwYmRkY2ZmYzc2MiJ9

    api: bigtable cla: yes owl-bot-copy 
    opened by gcf-owl-bot[bot] 6
  • docs: add sample for writing data with Beam

    Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

    • [ ] Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
    • [ ] Ensure the tests and linter pass
    • [ ] Code coverage does not decrease (if any source code was changed)
    • [ ] Appropriate docs were updated (if necessary)
    cla: yes 
    opened by billyjacobson 6
  • Bigtable: read_rows: no deadline causing stuck clients; no way to change it

    Table.read_rows does not set any deadline, so it can hang forever if the Bigtable server connection hangs. We see this happening once every week or two when running inside GCP, which causes our server to get stuck indefinitely. There appears to be no way in the API to set a deadline, even though the documentation says that the retry parameter should do this. Due to a bug, it does not.

    Details:

    We are calling Table.read_rows to read ~2 rows from BigTable. Using pyflame on a stuck process, both worker threads were waiting on Bigtable, with the stack trace below. I believe the bug is the following:

    1. Call Table.read_rows. This calls PartialRowsData, passing the retry argument, which defaults to DEFAULT_RETRY_READ_ROWS. The default misleadingly sets deadline=60.0. It also passes read_method=self._instance._client.table_data_client.transport.read_rows to PartialRowsData, which is a method on BigtableGrpcTransport.
    2. PartialRowsData.__init__ calls read_method(); this is actually the raw gRPC _UnaryStreamMultiCallable, not the gapic BigtableClient.read_rows, which, AFAICS, is never called. Hence, this gRPC streaming call is started without any deadline.
    3. PartialRowsData.__iter__ calls self._read_next_response, which calls return self.retry(self._read_next, on_error=self._on_error)(). This gives the impression that retry is used, but if I understand gRPC streams correctly, I'm not sure that even makes sense. I think that even if the gRPC stream returns some error, calling next won't actually retry the gRPC call; it will just immediately raise the same exception. To retry, I believe you need to actually restart it by calling read_rows again.
    4. If the Bigtable server now "hangs", the client hangs forever.
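The point in step 3, that a failed stream has to be reopened rather than polled again, can be illustrated without gRPC. In this sketch `read_with_restart` and `make_flaky_stream` are hypothetical names, `start_stream(after_key)` is an assumed callable that opens a fresh stream of rows sorted after `after_key`, and `ConnectionError` merely stands in for whatever error a real stream would raise.

```python
def read_with_restart(start_stream, max_restarts=3, retryable=(ConnectionError,)):
    """Yield (key, value) pairs, reopening the stream after a retryable error."""
    last_key = None
    restarts = 0
    while True:
        # Open a new stream that resumes after the last row already delivered.
        stream = start_stream(last_key)
        try:
            for key, value in stream:
                last_key = key
                yield key, value
            return  # the stream ended normally
        except retryable:
            restarts += 1
            if restarts > max_restarts:
                raise
            # Calling next() on the failed iterator again would not help,
            # so loop around and start a fresh stream instead.


def make_flaky_stream(rows, fail_at):
    """Build a fake `start_stream` that fails once when it reaches `fail_at`."""
    state = {"failed": False}

    def start_stream(after_key):
        for key, value in rows:
            if after_key is not None and key <= after_key:
                continue  # already delivered before the restart
            if key == fail_at and not state["failed"]:
                state["failed"] = True
                raise ConnectionError("simulated broken stream")
            yield key, value

    return start_stream
```

Tracking the last delivered key is what lets the restarted stream avoid returning duplicates.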

    Possible fix:

    Change Table.read_rows to call the gapic BigtableClient.read_rows with the retry parameter, and change PartialRowsData.__init__ to take this response iterator and not take a retry parameter at all. This would at least allow setting the gRPC streaming call deadline, although I don't think it will make retrying work (since I think the gRPC streaming client will just immediately return an iterator without actually waiting for a response from the server?).

    I haven't actually tried implementing this to see if it works. For now, we will probably just make a raw gRPC read_rows call so we can set an appropriate timeout.
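Independently of how the library is fixed, a caller can guard against a stream that never produces its next item. The sketch below is a generic client-side workaround, not the fix proposed above and not part of google-cloud-bigtable: `iterate_with_deadline` and the local `DeadlineExceeded` class are hypothetical. It reads the blocking iterator on a daemon thread and gives up when an item takes too long; the worker thread stays blocked in the background and items are buffered without limit, so this is a stopgap rather than a substitute for a real call deadline.

```python
import queue
import threading


class DeadlineExceeded(Exception):
    """Local stand-in raised when the next item does not arrive in time."""


_DONE = object()  # sentinel marking normal end of the stream


def iterate_with_deadline(iterable, timeout):
    """Yield items from `iterable`; raise if any single item takes > `timeout` s."""
    items = queue.Queue()

    def pump():
        try:
            for item in iterable:
                items.put((item, None))
            items.put((_DONE, None))
        except Exception as exc:  # forward stream errors to the consumer
            items.put((None, exc))

    threading.Thread(target=pump, daemon=True).start()

    while True:
        try:
            item, error = items.get(timeout=timeout)
        except queue.Empty:
            raise DeadlineExceeded(f"no item within {timeout} seconds") from None
        if error is not None:
            raise error
        if item is _DONE:
            return
        yield item
```

Wrapping the row iterator from the reproduction script in `iterate_with_deadline(..., timeout=5.0)` would surface a hang as an exception instead of blocking the calling thread forever.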

    Environment details

    • OS: Linux, ContainerOS (GKE); container is Debian 9 (using distroless)
    • Python: 3.5.3
    • API: google-cloud-bigtable 0.33.0

    Steps to reproduce

    This program loads the Bigtable emulator with 1000 rows, calls read_rows(retry=DEFAULT.with_deadline(5.0)), then sends SIGSTOP to pause the emulator. This SHOULD cause a DeadlineExceeded exception to be raised after 5 seconds. Instead, it hangs forever.

    1. Start the Bigtable emulator: gcloud beta emulators bigtable start
    2. Find the PID: ps ax | grep cbtemulator
    3. Run the following program with BIGTABLE_EMULATOR_HOST=localhost:8086 python3 bug.py $PID

    from google.api_core import exceptions
    from google.cloud import bigtable
    from google.rpc import code_pb2
    from google.rpc import status_pb2
    import os
    import signal
    import sys
    
    COLUMN_FAMILY_ID = 'column_family_id'
    
    def main():
        emulator_pid = int(sys.argv[1])
        client = bigtable.Client(project="testing", admin=True)
        instance = client.instance("emulator")
    
        # create/open a table
        table = instance.table("emulator_table")
        column_family = table.column_family(COLUMN_FAMILY_ID)
        try:
            table.create()
            column_family.create()
        except exceptions.AlreadyExists:
            print('table exists')
    
        # write a bunch of data
        for i in range(1000):
            k = 'some_key_{:04d}'.format(i)
            print(k)
            row = table.row(k)
            row.set_cell(COLUMN_FAMILY_ID, 'column', 'some_value{:d}'.format(i) * 1000)
            result = table.mutate_rows([row])
            assert len(result) == 1 and result[0].code == code_pb2.OK
            assert table.read_row(k) is not None
    
        print('starting read')
        rows = table.read_rows(retry=bigtable.table.DEFAULT_RETRY_READ_ROWS.with_deadline(5.0))
        rows_iter = iter(rows)
        r1 = next(rows_iter)
        print('read', r1)
        os.kill(emulator_pid, signal.SIGSTOP)
        print('sent sigstop')
        for r in rows_iter:
            print(r)
        print('done')
    
    
    if __name__ == '__main__':
        main()
    

    Stack trace of hung server (using a slightly older version of the google-cloud-bigtable library)

    /usr/local/lib/python2.7/threading.py:wait:340
    /usr/local/lib/python2.7/site-packages/grpc/_channel.py:_next:348
    /usr/local/lib/python2.7/site-packages/grpc/_channel.py:next:366
    /usr/local/lib/python2.7/site-packages/google/cloud/bigtable/row_data.py:_read_next:426
    /usr/local/lib/python2.7/site-packages/google/api_core/retry.py:retry_target:179
    /usr/local/lib/python2.7/site-packages/google/api_core/retry.py:retry_wrapped_func:270
    /usr/local/lib/python2.7/site-packages/google/cloud/bigtable/row_data.py:_read_next_response:430
    /usr/local/lib/python2.7/site-packages/google/cloud/bigtable/row_data.py:__iter__:441
    
    api: bigtable type: bug priority: p2 :rotating_light: 
    opened by evanj 6
  • chore(deps): update all dependencies

    This PR contains the following updates:

    | Package | Change | Age | Adoption | Passing | Confidence |
    |---|---|---|---|---|---|
    | google-cloud-monitoring | ==2.11.3 -> ==2.12.0 | age | adoption | passing | confidence |
    | mock (source) | ==4.0.3 -> ==5.0.0 | age | adoption | passing | confidence |


    Release Notes

    googleapis/python-monitoring

    v2.12.0

    Features
    • Add typing to proto.Message based class attributes (eaaca48)
    Bug Fixes
    • Add dict typing for client_options (eaaca48)
    • Add metric label example to the snippet (#509) (48b4e35)
    • Add missing argument description (#504) (8d54a7e)
    • deps: Require google-api-core >=1.34.0, >=2.11.0 (eaaca48)
    • Drop usage of pkg_resources (eaaca48)
    • Fix timeout default values (eaaca48)
    • Remove duplicate variable declaration (#503) (99a981c)
    Documentation
    • samples: Snippetgen should call await on the operation coroutine before calling result (eaaca48)
    testing-cabal/mock

    v5.0.0

    • gh-98624: Add a mutex to unittest.mock.NonCallableMock to protect concurrent access to mock attributes.

    • bpo-43478: Mocks can no longer be used as the specs for other Mocks. As a result, an already-mocked object cannot have an attribute mocked using autospec=True or be the subject of a create_autospec(...) call. This can uncover bugs in tests since these Mock-derived Mocks will always pass certain tests (e.g. isinstance) and builtin assert functions (e.g. assert_called_once_with) will unconditionally pass.

    • bpo-45156: Fixes infinite loop on :func:unittest.mock.seal of mocks created by :func:~unittest.create_autospec.

    • bpo-41403: Make :meth:mock.patch raise a :exc:TypeError with a relevant error message on invalid arg. Previously it allowed a cryptic :exc:AttributeError to escape.

    • gh-91803: Fix an error when using a method of objects mocked with :func:unittest.mock.create_autospec after it was sealed with :func:unittest.mock.seal function.

    • bpo-41877: AttributeError for suspected misspellings of assertions on mocks are now pointing out that the cause are misspelled assertions and also what to do if the misspelling is actually an intended attribute name. The unittest.mock document is also updated to reflect the current set of recognised misspellings.

    • bpo-43478: Mocks can no longer be provided as the specs for other Mocks. As a result, an already-mocked object cannot be passed to mock.Mock(). This can uncover bugs in tests since these Mock-derived Mocks will always pass certain tests (e.g. isinstance) and builtin assert functions (e.g. assert_called_once_with) will unconditionally pass.

    • bpo-45010: Remove support of special method __div__ in :mod:unittest.mock. It is not used in Python 3.

    • gh-84753: :func:inspect.iscoroutinefunction now properly returns True when an instance of :class:unittest.mock.AsyncMock is passed to it. This makes it consistent with behavior of :func:asyncio.iscoroutinefunction. Patch by Mehdi ABAAKOUK.

    • bpo-46852: Remove the undocumented private float.__set_format__() method, previously known as float.__setformat__() in Python 3.7. Its docstring said: "You probably don't want to use this function. It exists mainly to be used in Python's test suite." Patch by Victor Stinner.

    • gh-98086: Make sure patch.dict() can be applied on async functions.

    • gh-100287: Fix the interaction of :func:unittest.mock.seal with :class:unittest.mock.AsyncMock.

    • gh-83076: Instantiation of Mock() and AsyncMock() is now 3.8x faster.

    • bpo-41877: A check is added against misspellings of autospect, auto_spec and set_spec being passed as arguments to patch, patch.object and create_autospec.


    Configuration

    📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

    🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

    Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

    👻 Immortal: This PR will be recreated if closed unmerged. Get config help if that's undesired.


    • [ ] If you want to rebase/retry this PR, check this box

    This PR has been generated by Mend Renovate. View repository job log here.

    api: bigtable size: xs 
    opened by renovate-bot 0
  • Asynchronous batching

    Asynchronous batching

    Call the async mutate_rows from grpc aio

    Create flush and mutate_coroutines
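
As a rough illustration of the idea (buffer mutations, send full batches as coroutines, and wait for them on flush), here is a minimal asyncio sketch. It does not use the Bigtable client or grpc aio; `AsyncBatcher`, `mutate`, `flush`, and `fake_send` are hypothetical names chosen for this example, and the real PR may be structured quite differently.

```python
import asyncio
from typing import Awaitable, Callable, List


class AsyncBatcher:
    """Hypothetical batcher: buffers items and sends full batches as tasks."""

    def __init__(
        self,
        send: Callable[[List[str]], Awaitable[None]],
        flush_count: int = 3,
    ) -> None:
        self._send = send  # coroutine function that sends one batch
        self._flush_count = flush_count
        self._pending: List[str] = []
        self._tasks: List["asyncio.Future[None]"] = []

    async def mutate(self, item: str) -> None:
        """Queue one item; start a background send once the batch is full."""
        self._pending.append(item)
        if len(self._pending) >= self._flush_count:
            self._start_send()

    def _start_send(self) -> None:
        # Swap the buffer out so new items go into a fresh batch.
        batch, self._pending = self._pending, []
        if batch:
            self._tasks.append(asyncio.ensure_future(self._send(batch)))

    async def flush(self) -> None:
        """Send whatever is buffered and wait for all in-flight batches."""
        self._start_send()
        tasks, self._tasks = self._tasks, []
        if tasks:
            await asyncio.gather(*tasks)


async def demo() -> List[List[str]]:
    sent: List[List[str]] = []

    async def fake_send(batch: List[str]) -> None:
        await asyncio.sleep(0)  # stand-in for a network call
        sent.append(batch)

    batcher = AsyncBatcher(fake_send, flush_count=3)
    for i in range(7):
        await batcher.mutate(f"row-{i}")
    await batcher.flush()
    return sent
```

Running `asyncio.run(demo())` sends seven rows as two full batches plus one partial batch that is only sent by the final `flush()`.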

    Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

    • [ ] Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
    • [ ] Ensure the tests and linter pass
    • [ ] Code coverage does not decrease (if any source code was changed)
    • [ ] Appropriate docs were updated (if necessary)

    Fixes #<issue_number_goes_here> 🦕

    api: bigtable size: m 
    opened by Mariatta 0
  • tests.system.test_table_admin: many tests failed

    tests.system.test_table_admin: many tests failed

    Many tests failed at the same time in this package.

    • I will close this issue when there are no more failures in this package and there is at least one pass.
    • No new issues will be filed for this package until this issue is closed.
    • If there are already issues for individual test cases, I will close them when the corresponding test passes. You can close them earlier, if you prefer, and I won't reopen them while this issue is still open.

    Here are the tests that failed:

    • test_instance_list_tables
    • test_table_exists
    • test_table_create
    • test_table_create_w_families
    • test_table_create_w_split_keys
    • test_column_family_create (#625)
    • test_column_family_update
    • test_column_family_delete (#637)
    • test_table_get_iam_policy
    • test_table_set_iam_policy
    • test_table_test_iam_permissions
    • test_table_backup (#641)

    commit: 49b780d1816df9a83ba0748e33edb1f6b6543758 buildURL: Build Status, Sponge status: failed

    api: bigtable type: bug priority: p2 flakybot: issue flakybot: flaky 
    opened by flaky-bot[bot] 1
  • tests.system.test_data_api: many tests failed

    tests.system.test_data_api: many tests failed

    Many tests failed at the same time in this package.

    • I will close this issue when there are no more failures in this package and there is at least one pass.
    • No new issues will be filed for this package until this issue is closed.
    • If there are already issues for individual test cases, I will close them when the corresponding test passes. You can close them earlier, if you prefer, and I won't reopen them while this issue is still open.

    Here are the tests that failed:

    • test_table_read_rows_filter_millis
    • test_table_mutate_rows
    • test_table_truncate
    • test_table_drop_by_prefix
    • test_table_read_rows_w_row_set
    • test_rowset_add_row_range_w_pfx
    • test_table_read_row_large_cell
    • test_table_read_row
    • test_table_read_rows
    • test_read_with_label_applied
    • test_access_with_non_admin_client (#621)

    commit: 49b780d1816df9a83ba0748e33edb1f6b6543758 buildURL: Build Status, Sponge status: failed

    api: bigtable type: bug priority: p2 flakybot: issue flakybot: flaky 
    opened by flaky-bot[bot] 1
  • tests.system.test_instance_admin: test_instance_create_prod failed

    tests.system.test_instance_admin: test_instance_create_prod failed

    Note: #457 was also for this test, but it was closed more than 10 days ago. So, I didn't mark it flaky.


    commit: d2bfca61c6ba09ee6a5d85c1e7bfbbe72991efa6 buildURL: Build Status, Sponge status: failed

    Test output
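    target = functools.partial(<bound method PollingFuture._done_or_raise of <google.api_core.operation.Operation object at 0x7f92945716d0>>)
    predicate = <function if_exception_type.<locals>.if_exception_type_predicate at 0x7f92971659d0>
    sleep_generator = <generator object exponential_sleep_generator at 0x7f9295602e40>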
    deadline = 60, on_error = None
    
    def retry_target(target, predicate, sleep_generator, deadline, on_error=None):
        """Call a function and retry if it fails.
    
        This is the lowest-level retry helper. Generally, you'll use the
        higher-level retry helper :class:`Retry`.
    
        Args:
            target(Callable): The function to call and retry. This must be a
                nullary function - apply arguments with `functools.partial`.
            predicate (Callable[Exception]): A callable used to determine if an
                exception raised by the target should be considered retryable.
                It should return True to retry or False otherwise.
            sleep_generator (Iterable[float]): An infinite iterator that determines
                how long to sleep between retries.
            deadline (float): How long to keep retrying the target. The last sleep
                period is shortened as necessary, so that the last retry runs at
                ``deadline`` (and not considerably beyond it).
            on_error (Callable[Exception]): A function to call while processing a
                retryable exception.  Any error raised by this function will *not*
                be caught.
    
        Returns:
            Any: the return value of the target function.
    
        Raises:
            google.api_core.RetryError: If the deadline is exceeded while retrying.
            ValueError: If the sleep generator stops yielding values.
            Exception: If the target raises a method that isn't retryable.
        """
        if deadline is not None:
            deadline_datetime = datetime_helpers.utcnow() + datetime.timedelta(
                seconds=deadline
            )
        else:
            deadline_datetime = None
    
        last_exc = None
    
        for sleep in sleep_generator:
            try:
                return target()

    .nox/system-3-8/lib/python3.8/site-packages/google/api_core/retry.py:190:


    self = <google.api_core.operation.Operation object at 0x7f92945716d0>
    retry = <google.api_core.retry.Retry object at 0x7f929716a340>

    def _done_or_raise(self, retry=DEFAULT_RETRY):
        """Check if the future is done and raise if it's not."""
        kwargs = {} if retry is DEFAULT_RETRY else {"retry": retry}
    
        if not self.done(**kwargs):
            raise _OperationNotComplete()

    E google.api_core.future.polling._OperationNotComplete

    .nox/system-3-8/lib/python3.8/site-packages/google/api_core/future/polling.py:89: _OperationNotComplete

    The above exception was the direct cause of the following exception:

    self = <google.api_core.operation.Operation object at 0x7f92945716d0>
    timeout = 60, retry = <google.api_core.retry.Retry object at 0x7f929716a340>

    def _blocking_poll(self, timeout=None, retry=DEFAULT_RETRY):
        """Poll and wait for the Future to be resolved.
    
        Args:
            timeout (int):
                How long (in seconds) to wait for the operation to complete.
                If None, wait indefinitely.
        """
        if self._result_set:
            return
    
        retry_ = self._retry.with_deadline(timeout)
    
        try:
            kwargs = {} if retry is DEFAULT_RETRY else {"retry": retry}
            retry_(self._done_or_raise)(**kwargs)

    .nox/system-3-8/lib/python3.8/site-packages/google/api_core/future/polling.py:110:


    args = (), kwargs = {}
    target = functools.partial(<bound method PollingFuture._done_or_raise of <google.api_core.operation.Operation object at 0x7f92945716d0>>)
    sleep_generator = <generator object exponential_sleep_generator at 0x7f9295602e40>

    @functools.wraps(func)
    def retry_wrapped_func(*args, **kwargs):
        """A wrapper that calls target function with retry."""
        target = functools.partial(func, *args, **kwargs)
        sleep_generator = exponential_sleep_generator(
            self._initial, self._maximum, multiplier=self._multiplier
        )
        return retry_target(
            target,
            self._predicate,
            sleep_generator,
            self._deadline,
            on_error=on_error,
        )
    

    .nox/system-3-8/lib/python3.8/site-packages/google/api_core/retry.py:283:


    target = functools.partial(<bound method PollingFuture._done_or_raise of <google.api_core.operation.Operation object at 0x7f92945716d0>>)
    predicate = <function if_exception_type.<locals>.if_exception_type_predicate at 0x7f92971659d0>
    sleep_generator = <generator object exponential_sleep_generator at 0x7f9295602e40>
    deadline = 60, on_error = None

    def retry_target(target, predicate, sleep_generator, deadline, on_error=None):
        """Call a function and retry if it fails.
    
        This is the lowest-level retry helper. Generally, you'll use the
        higher-level retry helper :class:`Retry`.
    
        Args:
            target(Callable): The function to call and retry. This must be a
                nullary function - apply arguments with `functools.partial`.
            predicate (Callable[Exception]): A callable used to determine if an
                exception raised by the target should be considered retryable.
                It should return True to retry or False otherwise.
            sleep_generator (Iterable[float]): An infinite iterator that determines
                how long to sleep between retries.
            deadline (float): How long to keep retrying the target. The last sleep
                period is shortened as necessary, so that the last retry runs at
                ``deadline`` (and not considerably beyond it).
            on_error (Callable[Exception]): A function to call while processing a
                retryable exception.  Any error raised by this function will *not*
                be caught.
    
        Returns:
            Any: the return value of the target function.
    
        Raises:
            google.api_core.RetryError: If the deadline is exceeded while retrying.
            ValueError: If the sleep generator stops yielding values.
            Exception: If the target raises a method that isn't retryable.
        """
        if deadline is not None:
            deadline_datetime = datetime_helpers.utcnow() + datetime.timedelta(
                seconds=deadline
            )
        else:
            deadline_datetime = None
    
        last_exc = None
    
        for sleep in sleep_generator:
            try:
                return target()
    
            # pylint: disable=broad-except
            # This function explicitly must deal with broad exceptions.
            except Exception as exc:
                if not predicate(exc):
                    raise
                last_exc = exc
                if on_error is not None:
                    on_error(exc)
    
            now = datetime_helpers.utcnow()
    
            if deadline_datetime is not None:
                if deadline_datetime <= now:
                    raise exceptions.RetryError(
                        "Deadline of {:.1f}s exceeded while calling target function".format(
                            deadline
                        ),
                        last_exc,
                    ) from last_exc
    

    E google.api_core.exceptions.RetryError: Deadline of 60.0s exceeded while calling target function, last exception:

    .nox/system-3-8/lib/python3.8/site-packages/google/api_core/retry.py:205: RetryError

    During handling of the above exception, another exception occurred:

    admin_client = <google.cloud.bigtable.client.Client object at 0x7f9296eb1d60>
    unique_suffix = '-1661370419528', location_id = 'us-central1-c'
    instance_labels = {'python-system': '2022-08-24t19-46-59'}
    instances_to_delete = [<google.cloud.bigtable.instance.Instance object at 0x7f9294571640>]
    skip_on_emulator = None

    def test_instance_create_prod(
        admin_client,
        unique_suffix,
        location_id,
        instance_labels,
        instances_to_delete,
        skip_on_emulator,
    ):
        from google.cloud.bigtable import enums
    
        alt_instance_id = f"ndef{unique_suffix}"
        instance = admin_client.instance(alt_instance_id, labels=instance_labels)
        alt_cluster_id = f"{alt_instance_id}-cluster"
        serve_nodes = 1
        cluster = instance.cluster(
            alt_cluster_id,
            location_id=location_id,
            serve_nodes=serve_nodes,
        )
    
        operation = instance.create(clusters=[cluster])
        instances_to_delete.append(instance)
        operation.result(timeout=60)  # Ensure the operation completes.

    tests/system/test_instance_admin.py:166:


    .nox/system-3-8/lib/python3.8/site-packages/google/api_core/future/polling.py:132: in result
        self._blocking_poll(timeout=timeout, **kwargs)


    self = <google.api_core.operation.Operation object at 0x7f92945716d0>
    timeout = 60, retry = <google.api_core.retry.Retry object at 0x7f929716a340>

    def _blocking_poll(self, timeout=None, retry=DEFAULT_RETRY):
        """Poll and wait for the Future to be resolved.
    
        Args:
            timeout (int):
                How long (in seconds) to wait for the operation to complete.
                If None, wait indefinitely.
        """
        if self._result_set:
            return
    
        retry_ = self._retry.with_deadline(timeout)
    
        try:
            kwargs = {} if retry is DEFAULT_RETRY else {"retry": retry}
            retry_(self._done_or_raise)(**kwargs)
        except exceptions.RetryError:
            raise concurrent.futures.TimeoutError(
                "Operation did not complete within the designated " "timeout."
            )
    

    E concurrent.futures._base.TimeoutError: Operation did not complete within the designated timeout.

    .nox/system-3-8/lib/python3.8/site-packages/google/api_core/future/polling.py:112: TimeoutError
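
The traceback above shows a polling loop that retries until a deadline and then converts the failure into a timeout. The following standard-library sketch mirrors that pattern in simplified form. It is not the google-api-core implementation: `DeadlineExceeded`, `exponential_sleeps`, and `retry_until_deadline` are hypothetical names, and the injectable `sleep` and `clock` parameters are an assumption added so the loop can be exercised without real waiting.

```python
import time
from typing import Any, Callable, Iterable, Iterator, Optional


class DeadlineExceeded(Exception):
    """Hypothetical stand-in for the RetryError seen in the traceback."""


def exponential_sleeps(
    initial: float, maximum: float, multiplier: float = 2.0
) -> Iterator[float]:
    """Yield an endless sequence of delays that grows up to `maximum`."""
    delay = initial
    while True:
        yield min(delay, maximum)
        delay *= multiplier


def retry_until_deadline(
    target: Callable[[], Any],
    is_retryable: Callable[[Exception], bool],
    sleeps: Iterable[float],
    deadline: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Any:
    """Call `target` until it succeeds or `deadline` seconds have passed."""
    start = clock()
    last_exc: Optional[Exception] = None
    for delay in sleeps:
        try:
            return target()
        except Exception as exc:  # broad on purpose: the predicate decides
            if not is_retryable(exc):
                raise
            last_exc = exc
        elapsed = clock() - start
        if elapsed >= deadline:
            raise DeadlineExceeded(
                f"Deadline of {deadline:.1f}s exceeded"
            ) from last_exc
        # Shorten the last sleep so the final attempt lands on the deadline.
        sleep(min(delay, deadline - elapsed))
    raise ValueError("Sleep generator stopped yielding sleep values.")
```

In the failing test, the operation never reported completion within 60 seconds, so every attempt raised the retryable "not complete" error until the deadline check fired.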

    api: bigtable type: bug priority: p2 flakybot: issue flakybot: flaky 
    opened by flaky-bot[bot] 1
Releases(v2.14.1)
Owner
Google APIs
Clients for Google APIs and tools that help produce them.
A supercharged SQLite library for Python

SuperSQLite: a supercharged SQLite library for Python A feature-packed Python package for utilizing SQLite in Python by Plasticity. It is intended

Plasticity 703 Dec 30, 2022
Find graph motifs using intuitive notation

d o t m o t i f Find graph motifs using intuitive notation DotMotif is a library that identifies subgraphs or motifs in a large graph. It looks like t

APL BRAIN 45 Jan 02, 2023
A Python-based RPC-like toolkit for interfacing with QuestDB.

pykit A Python-based RPC-like toolkit for interfacing with QuestDB. Requirements Python 3.9 Java Azul

QuestDB 11 Aug 03, 2022
Estoult - a Python toolkit for data mapping with an integrated query builder for SQL databases

Estoult Estoult is a Python toolkit for data mapping with an integrated query builder for SQL databases. It currently supports MySQL, PostgreSQL, and

halcyon[nouveau] 15 Dec 29, 2022
Async database support for Python. 🗄

Databases Databases gives you simple asyncio support for a range of databases. It allows you to make queries using the powerful SQLAlchemy Core expres

Encode 3.2k Dec 30, 2022
python-beryl, a Python driver for BerylDB.

python-beryl, a Python driver for BerylDB.

BerylDB 3 Nov 24, 2021
#crypto #cipher #encode #decode #hash

🌹 CYPHER TOOLS 🌹 Written by TMRSWRR Version 1.0.0 All in one tools for CRYPTOLOGY. Instagram: Capture the Root 🖼️ Screenshots 🖼️ 📹 How to use 📹

50 Dec 23, 2022
A fast MySQL driver written in pure C/C++ for Python. Compatible with gevent through monkey patching.

:: Description :: A fast MySQL driver written in pure C/C++ for Python. Compatible with gevent through monkey patching :: Requirements :: Requires P

ESN Social Software 549 Nov 18, 2022
A Python library for Cloudant and CouchDB

Cloudant Python Client This is the official Cloudant library for Python. Installation and Usage Getting Started API Reference Related Documentation De

Cloudant 162 Dec 19, 2022
A Pythonic, object-oriented interface for working with MongoDB.

PyMODM MongoDB has paused the development of PyMODM. If there are any users who want to take over and maintain this project, or if you just have quest

mongodb 345 Dec 25, 2022
Records is a very simple, but powerful, library for making raw SQL queries to most relational databases.

Records: SQL for Humans™ Records is a very simple, but powerful, library for making raw SQL queries to most relational databases. Just write SQL. No b

Kenneth Reitz 6.9k Jan 03, 2023
Familiar asyncio ORM for python, built with relations in mind

Tortoise ORM Introduction Tortoise ORM is an easy-to-use asyncio ORM (Object Relational Mapper) inspired by Django. Tortoise ORM was built with relati

Tortoise 3.3k Dec 31, 2022
PostgreSQL database access simplified

Queries: PostgreSQL Simplified Queries is a BSD licensed opinionated wrapper of the psycopg2 library for interacting with PostgreSQL. The popular psyc

Gavin M. Roy 251 Oct 25, 2022
An extension package of 🤗 Datasets that provides support for executing arbitrary SQL queries on HF datasets

datasets_sql A 🤗 Datasets extension package that provides support for executing arbitrary SQL queries on HF datasets. It uses DuckDB as a SQL engine

Mario Šaško 19 Dec 15, 2022
Class to connect to XAMPP MySQL Database

MySQL-DB-Connection-Class Class to connect to XAMPP MySQL Database Just download mysql_connect.py and modify whichever parameters you want. E d

Alexandre Pimentel 4 Jul 12, 2021
The Database Toolkit for Python

SQLAlchemy The Python SQL Toolkit and Object Relational Mapper Introduction SQLAlchemy is the Python SQL toolkit and Object Relational Mapper that giv

SQLAlchemy 6.5k Jan 01, 2023
Asynchronous, fast, pythonic DynamoDB Client

AsyncIO DynamoDB Asynchronous pythonic DynamoDB client; 2x faster than aiobotocore/boto3/botocore. Quick start With httpx Install this library pip ins

HENNGE 48 Dec 18, 2022
Simple DDL Parser to parse SQL (HQL, TSQL, AWS Redshift, Snowflake and other dialects) ddl files to json/python dict with full information about columns: types, defaults, primary keys, etc.

Simple DDL Parser Build with ply (lex & yacc in python). A lot of samples in 'tests/. Is it Stable? Yes, library already has about 5000+ usage per day

Iuliia Volkova 95 Jan 05, 2023
Sample code to extract data directly from the NetApp AIQUM MySQL Database

This sample code shows how to connect to the AIQUM Database and pull user quota details from it. AIQUM Requirements: 1. AIQUM 9.7 or higher. 2. An

1 Nov 08, 2021
PubMed Mapper: A Python library that maps PubMed XML to Python objects

pubmed-mapper: A Python library that maps PubMed XML to Python objects Chinese documentation 1. Philosophy view UML Programmatically accessing PubMed articles is a common ta

灵魂工具人 33 Dec 08, 2022