Overview

DeepConsensus

DeepConsensus uses gap-aware sequence transformers to correct errors in Pacific Biosciences (PacBio) Circular Consensus Sequencing (CCS) data.

DeepConsensus overview diagram

Installation

From pip package

pip install deepconsensus==0.1.0

You can ignore errors regarding google-nucleus installation, such as ERROR: Failed building wheel for google-nucleus.

From source

git clone https://github.com/google/deepconsensus.git
cd deepconsensus
source install.sh

(Optional) After running source install.sh, you can run all the unit tests with:

./run_all_tests.sh

Usage

See the quick start.

Where does DeepConsensus fit into my pipeline?

After a PacBio sequencing run, DeepConsensus takes the CCS reads and the subreads and produces new, corrected reads in FASTQ format that can replace the CCS reads in downstream analyses.
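The hand-off described above can be sketched as a small script. The file names are hypothetical, and the deepconsensus run flags follow the v1.0-era quick start (an assumption; check deepconsensus run --help for your installed version). The script only echoes the commands, so it stands alone without the tools installed.

```shell
#!/bin/sh
# Where DeepConsensus sits in the pipeline: subreads -> ccs -> deepconsensus.
# Hypothetical file names; flags are assumptions based on the v1.0 quick start.
subreads="movie1.subreads.bam"        # raw subreads from the instrument
ccs_reads="movie1.ccs.bam"            # draft consensus reads from ccs
aligned="movie1.subreads_to_ccs.bam"  # subreads aligned back to their CCS read

# 1) Draft consensus with PacBio's ccs tool.
step1="ccs ${subreads} ${ccs_reads}"
# 2) Align subreads to the CCS reads (PacBio's actc is used for this in
#    later quick starts; treat the exact tool as an assumption).
step2="actc ${subreads} ${ccs_reads} ${aligned}"
# 3) Polish with DeepConsensus; the FASTQ output replaces the CCS reads downstream.
step3="deepconsensus run --subreads_to_ccs=${aligned} --ccs_bam=${ccs_reads} --checkpoint=model/checkpoint --output=movie1.deepconsensus.fastq"

# Echo the plan rather than executing it.
printf '%s\n' "$step1" "$step2" "$step3"
```

Each step consumes the previous step's output, which is why DeepConsensus needs both the CCS reads and the original subreads.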

See the quick start for an example of inputs and outputs.

NOTE: This initial release of DeepConsensus (v0.1) is not yet optimized for speed and runs only on CPUs. We expect this version to be too slow for many use cases. We are now prioritizing speed improvements, which we anticipate will bring runtimes to acceptable levels.

How to cite

If you are using DeepConsensus in your work, please cite:

DeepConsensus: Gap-Aware Sequence Transformers for Sequence Correction

Disclaimer

This is not an official Google product.

NOTE: the content of this research code repository (i) is not intended to be a medical device; and (ii) is not intended for clinical use of any kind, including but not limited to diagnosis or prognosis.

Comments
  • Error: ValueError: Shapes (2640, 280) and (560, 280) are incompatible

    Hi, I'm getting the below error:

    Total params: 9,525,067
    Trainable params: 9,525,067
    Non-trainable params: 0
    _________________________________________________________________
    Traceback (most recent call last):
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/saving/saveable_object_util.py", line 130, in restore
        assigned_variable = resource_variable_ops.shape_safe_assign_variable_handle(
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 308, in shape_safe_assign_variable_handle
        shape.assert_is_compatible_with(value_tensor.shape)
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/framework/tensor_shape.py", line 1291, in assert_is_compatible_with
        raise ValueError("Shapes %s and %s are incompatible" % (self, other))
    ValueError: Shapes (2640, 280) and (560, 280) are incompatible
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/opt/conda/envs/bio/bin/deepconsensus", line 8, in <module>
        sys.exit(run())
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/cli.py", line 111, in run
        app.run(main, flags_parser=parse_flags)
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/absl/app.py", line 312, in run
        _run_main(main, args)
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
        sys.exit(main(argv))
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/cli.py", line 102, in main
        app.run(quick_inference.main, argv=passed)
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/absl/app.py", line 312, in run
        _run_main(main, args)
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
        sys.exit(main(argv))
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/inference/quick_inference.py", line 814, in main
        outcome_counter = run()
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/inference/quick_inference.py", line 734, in run
        loaded_model, model_params = initialize_model(
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/inference/quick_inference.py", line 476, in initialize_model
        checkpoint.restore(
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 2537, in restore
        status = self.read(save_path, options=options)
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 2417, in read
        result = self._saver.restore(save_path=save_path, options=options)
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 1468, in restore
        base.CheckpointPosition(
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 295, in restore
        restore_ops = trackable._restore_from_checkpoint_position(self)  # pylint: disable=protected-access
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 1060, in _restore_from_checkpoint_position
        current_position.checkpoint.restore_saveables(tensor_saveables,
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 349, in restore_saveables
        new_restore_ops = functional_saver.MultiDeviceSaver(
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/saving/functional_saver.py", line 415, in restore
        restore_ops = restore_fn()
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/saving/functional_saver.py", line 398, in restore_fn
        restore_ops.update(saver.restore(file_prefix, options))
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/saving/functional_saver.py", line 112, in restore
        restore_ops[saveable.name] = saveable.restore(
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/saving/saveable_object_util.py", line 133, in restore
        raise ValueError(
    ValueError: Received incompatible tensor with shape (560, 280) when attempting to restore variable with shape (2640, 280) and name model/transformer_input_condenser/kernel/.ATTRIBUTES/VARIABLE_VALUE.
    
    opened by gevro 15
  • Running deepconsensus results in "free(): invalid pointer" error

    I installed deepconsensus via pip in a virtualenv like this:

    virtualenv /apps/deepconsensus/1.0.0/python-3.8.2_cpu
    source /apps/deepconsensus/1.0.0/python-3.8.2_cpu/bin/activate
    pip install pyyaml==5.4.1 'deepconsensus[cpu]==1.0.0'
    

    I used pyyaml==5.4.1 since tf-models-official 2.10.0 requires pyyaml<6.0,>=5.1. I'm using Python 3.8.2.

    When I run deepconsensus it fails, even just for the help message. Running deepconsensus -h produced this error message:

    2022-11-10 12:51:17.016037: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
    To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
    *** Error in `/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python': free(): invalid pointer: 0x00007f075c296c80 ***
    ======= Backtrace: =========
    /lib64/libc.so.6(+0x81329)[0x7f078b91d329]
    /lib64/libstdc++.so.6(_ZNSt6locale5_Impl16_M_install_facetEPKNS_2idEPKNS_5facetE+0x142)[0x7f075c000192]
    /lib64/libstdc++.so.6(_ZNSt6locale5_ImplC1Em+0x1e3)[0x7f075c0005e3]
    /lib64/libstdc++.so.6(+0x71555)[0x7f075c001555]
    /lib64/libpthread.so.0(+0x620b)[0x7f078c37920b]
    /lib64/libstdc++.so.6(+0x715a1)[0x7f075c0015a1]
    /lib64/libstdc++.so.6(_ZNSt6localeC2Ev+0x13)[0x7f075c0015e3]
    /lib64/libstdc++.so.6(_ZNSt8ios_base4InitC2Ev+0xbc)[0x7f075bffe43c]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so(+0xb1150)[0x7f075bdd0150]
    /lib64/ld-linux-x86-64.so.2(+0xf9c3)[0x7f078c7d59c3]
    /lib64/ld-linux-x86-64.so.2(+0x1459e)[0x7f078c7da59e]
    /lib64/ld-linux-x86-64.so.2(+0xf7d4)[0x7f078c7d57d4]
    /lib64/ld-linux-x86-64.so.2(+0x13b8b)[0x7f078c7d9b8b]
    /lib64/libdl.so.2(+0xfab)[0x7f078c16ffab]
    /lib64/ld-linux-x86-64.so.2(+0xf7d4)[0x7f078c7d57d4]
    /lib64/libdl.so.2(+0x15ad)[0x7f078c1705ad]
    /lib64/libdl.so.2(dlopen+0x31)[0x7f078c170041]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyImport_FindSharedFuncptr+0x16b)[0x539abb]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyImport_LoadDynamicModuleWithSpec+0x159)[0x503e69]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x501a23]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x46f563]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(PyVectorcall_Call+0x5c)[0x439d8c]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x76d8)[0x42a308]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalCodeWithName+0xadf)[0x4e171f]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyFunction_Vectorcall+0x90)[0x438570]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x422821]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x5f91)[0x428bc1]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x421571]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x422821]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x1fb5)[0x424be5]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x421571]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x422821]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x15af)[0x4241df]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x421571]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x422821]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x15af)[0x4241df]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x421571]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x422821]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x15af)[0x4241df]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x421571]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x437f74]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyObject_CallMethodIdObjArgs+0xf1)[0x439831]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(PyImport_ImportModuleLevelObject+0x3fd)[0x502c8d]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x5ee426]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x437c24]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x76d8)[0x42a308]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalCodeWithName+0xadf)[0x4e171f]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyFunction_Vectorcall+0x90)[0x438570]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x422821]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x15af)[0x4241df]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalCodeWithName+0xadf)[0x4e171f]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyFunction_Vectorcall+0x90)[0x438570]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x437f74]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyObject_CallMethodIdObjArgs+0xf1)[0x439831]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(PyImport_ImportModuleLevelObject+0x4e6)[0x502d76]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x6e78)[0x429aa8]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalCodeWithName+0xadf)[0x4e171f]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(PyEval_EvalCode+0x23)[0x4e1b43]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x5efe34]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x46f563]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(PyVectorcall_Call+0x5c)[0x439d8c]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x76d8)[0x42a308]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalCodeWithName+0xadf)[0x4e171f]
    ======= Memory map: ========
    00400000-006f3000 r-xp 00000000 00:31 37327427                           /zapps7/python/3.8.2/gcc-9.2.0/bin/python3.8
    008f2000-008f3000 r--p 002f2000 00:31 37327427                           /zapps7/python/3.8.2/gcc-9.2.0/bin/python3.8
    008f3000-0092b000 rw-p 002f3000 00:31 37327427                           /zapps7/python/3.8.2/gcc-9.2.0/bin/python3.8
    0092b000-0094c000 rw-p 00000000 00:00 0 
    01dd2000-03209000 rw-p 00000000 00:00 0                                  [heap]
    7f0754000000-7f0754021000 rw-p 00000000 00:00 0 
    7f0754021000-7f0758000000 ---p 00000000 00:00 0 
    7f075bd1f000-7f075bdbc000 r--p 00000000 00:31 77736144                   /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
    7f075bdbc000-7f075bf0b000 r-xp 0009d000 00:31 77736144                   /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
    7f075bf0b000-7f075bf7f000 r--p 001ec000 00:31 77736144                   /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
    7f075bf7f000-7f075bf80000 ---p 00260000 00:31 77736144                   /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
    7f075bf80000-7f075bf85000 r--p 00260000 00:31 77736144                   /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
    7f075bf85000-7f075bf8f000 rw-p 00265000 00:31 77736144                   /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
    zsh: abort      deepconsensus -h
    

    Is there something different I should have done when installing?

    opened by mjg0 14
  • Error (ccs software) - No space left on device (tmp file).

    Dear @pichuan,

    Using the Docker system, I installed the second version of the software and ran the tests successfully on our cluster. Now, during tests with real data, we found some issues in the ccs software. Usually we run the software on the nodes, and the output is written to the front-end. The front-end has more than 1 PB of space, while the nodes have only about 60 GB. The tmp files seem to be saved on the node, right? Is it possible to relocate these temp files to another path?

    Below, you can consult the error.

    | 20220123 11:27:44.871 | FATAL | Could not write BAM record to /tmp/13552.1.all.q/thread.7_0.ccs.bam
    | 20220123 11:27:44.982 | FATAL | Caught existing deep IO exception, ignoring thread 13
    | 20220123 11:27:44.985 | FATAL | Previous exception in DraftStage, aborting thread 13
    [the "Previous exception in DraftStage, aborting thread N" line repeats many times, for threads 0 through 29]
    | 20220123 11:27:44.987 | FATAL | Previous exception in Stage DraftPolish. Pumping buffers empty!
    | 20220123 11:27:44.988 | FATAL | Exception thrown in CCSWF
    | 20220123 11:27:52.068 | FATAL | ccs ERROR: [pbbam] BAM writer ERROR: could not write record: file: /tmp/13552.1.all.q/thread.7_0.ccs.bam.tmp reason: No space left on device
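A common workaround for this class of error (an assumption on my part, not confirmed in the thread) is that pbbam-based tools follow the usual Unix convention of honoring TMPDIR, so pointing it at a larger filesystem before running ccs relocates the scratch files:

```shell
#!/bin/sh
# Redirect temporary files to a filesystem with enough space before running ccs.
# Assumption: ccs/pbbam honor the TMPDIR convention; verify with your ccs
# version. The path here is a stand-in for a large shared filesystem.
scratch="./ccs_tmp"        # in practice, a path on the 1 PB front-end
mkdir -p "$scratch"
export TMPDIR="$scratch"

# Any ccs or deepconsensus invocation from this shell now inherits TMPDIR.
echo "TMPDIR=$TMPDIR"
```

On batch schedulers, the node-local job tmp directory (as seen in the /tmp/13552.1.all.q path above) is often set by the scheduler itself, so TMPDIR may need to be overridden inside the job script.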

    Best regards,

    André

    opened by AMMMachado 12
  • Training tutorial?

    Curious if there will be a tutorial (sorry if I missed it somewhere in the repo) for training a custom DeepConsensus model on reads other than human PacBio HiFi reads? I tried DeepConsensus on bacterial PacBio HiFi/CCS reads and, as expected, it does not perform as well as it does for human.

    opened by jelber2 11
  • Chunking with deepconsensus

    Hi, is there a way to do chunking with deepconsensus? For datasets where ccs was previously run in chunks that were then merged, it would help if deepconsensus could do its own chunking rather than having to redo the ccs chunking from the original subreads.
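Absent a built-in chunking flag in this release, one hedged workaround is to shard the work yourself: re-run ccs per chunk (pbccs has a --chunk i/N option) and run deepconsensus on each chunk independently. The sketch below only prints the plan; file names are hypothetical, and the subreads-to-CCS alignment and model checkpoint flags are omitted for brevity.

```shell
#!/bin/sh
# Hypothetical sharding plan: one ccs pass and one deepconsensus run per chunk,
# then concatenate the corrected FASTQs. Echoed, not executed.
n_chunks=4
subreads="movie1.subreads.bam"   # hypothetical input

make_plan() {
  i=1
  while [ "$i" -le "$n_chunks" ]; do
    # pbccs supports --chunk i/N for sharding; the deepconsensus command is
    # abbreviated here (alignment and checkpoint flags omitted).
    echo "ccs --chunk ${i}/${n_chunks} ${subreads} movie1.ccs.${i}.bam"
    echo "deepconsensus run --ccs_bam=movie1.ccs.${i}.bam --output=movie1.dc.${i}.fastq"
    i=$((i + 1))
  done
  echo "cat movie1.dc.*.fastq > movie1.deepconsensus.fastq"
}

plan=$(make_plan)
printf '%s\n' "$plan"
```

Each chunk is independent, so the per-chunk command pairs can be submitted as separate cluster jobs and the FASTQs concatenated once all jobs finish.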

    opened by gevro 7
  • CPU installation on HPC, no sudo, no docker/singularity

    Hello! :wave:

    I am trying to install DeepConsensus on an HPC environment (no GPU) without root permissions, and without Docker/Singularity access. I am approaching this by making a deepconsensus venv and trying to install with a modified install script. I have done the following steps:

    1. python3 -m venv $SCRATCH/venvs/deepconsensus_venv_1
    2. source deepconsensus_venv_1/bin/activate
    3. source install_edit.sh, where install_edit.sh skips the apt-get steps and just upgrades pip and installs requirements.txt and intel-tensorflow into the venv

    Here are the contents of install_edit.sh:

    #!/bin/bash
    # Copyright (c) 2021, Google Inc.
    # All rights reserved.
    # 
    # Redistribution and use in source and binary forms, with or without modification,
    # are permitted provided that the following conditions are met:
    # 
    # 1. Redistributions of source code must retain the above copyright notice, this
    #    list of conditions and the following disclaimer.
    # 
    # 2. Redistributions in binary form must reproduce the above copyright notice,
    #    this list of conditions and the following disclaimer in the documentation
    #    and/or other materials provided with the distribution.
    # 
    # 3. Neither the name of Google Inc. nor the names of its contributors
    #    may be used to endorse or promote products derived from this software without
    #    specific prior written permission.
    # 
    # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
    # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
    # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
    # DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
    # ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
    # (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
    # LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
    # ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
    # SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
    # Usage:  source install.sh
    #
    # This script installs all the packages required to build DeepConsensus.
    #
    # This script will run as-is on Ubuntu 20.04.
    #
    # We also assume that apt-get is already installed and available.
    
    function note_build_stage {
      echo "========== [$(date)] Stage '${1}' starting"
    }
    
    # Update package list
    ################################################################################
    
    # Install pip
    ################################################################################
    python3 -m pip install --upgrade pip
    
    # Update PATH so that newly installed pip is the one we actually use.
    export PATH="$SCRATCH/venvs/deepconsensus_venv_1/bin:$PATH"
    echo "$(pip --version)"
    
    # Install python packages used by DeepConsensus.
    ################################################################################
    python3 -m pip install -r requirements.txt
    python3 -m pip install "intel-tensorflow>=2.4.0,<=2.7.0"
    

    And here is the output from running that install script:

    (deepconsensus_venv_1) [[email protected] /lustre/fs5/vgl/scratch/labueg/deepconsensus]$ source install_edit.sh 
    Collecting pip
      Using cached pip-22.1.2-py3-none-any.whl (2.1 MB)
    Installing collected packages: pip
      Attempting uninstall: pip
        Found existing installation: pip 20.2.3
        Uninstalling pip-20.2.3:
          Successfully uninstalled pip-20.2.3
    Successfully installed pip-22.1.2
    pip 22.1.2 from /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages/pip (python 3.8)
    Collecting numpy>=1.19
      Using cached numpy-1.23.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)
    Collecting pandas>=1.1
      Using cached pandas-1.4.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.7 MB)
    Collecting tf-models-official<=2.7.0,>=2.4.0
      Using cached tf_models_official-2.7.0-py2.py3-none-any.whl (1.8 MB)
    Collecting ml_collections>=0.1.0
      Using cached ml_collections-0.1.1.tar.gz (77 kB)
      Preparing metadata (setup.py) ... done
    Collecting absl-py>=0.13.0
      Using cached absl_py-1.1.0-py3-none-any.whl (123 kB)
    Collecting pysam
      Using cached pysam-0.19.1.tar.gz (3.9 MB)
      Preparing metadata (setup.py) ... done
    Collecting python-dateutil>=2.8.1
      Using cached python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
    Collecting pytz>=2020.1
      Using cached pytz-2022.1-py2.py3-none-any.whl (503 kB)
    Collecting py-cpuinfo>=3.3.0
      Using cached py-cpuinfo-8.0.0.tar.gz (99 kB)
      Preparing metadata (setup.py) ... done
    Collecting opencv-python-headless
      Using cached opencv_python_headless-4.6.0.66-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (48.3 MB)
    Collecting tensorflow-datasets
      Using cached tensorflow_datasets-4.6.0-py3-none-any.whl (4.3 MB)
    Collecting gin-config
      Using cached gin_config-0.5.0-py3-none-any.whl (61 kB)
    Collecting tensorflow-hub>=0.6.0
      Using cached tensorflow_hub-0.12.0-py2.py3-none-any.whl (108 kB)
    Collecting tensorflow-text>=2.7.0
      Using cached tensorflow_text-2.9.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.6 MB)
    Collecting Cython
      Using cached Cython-0.29.30-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (1.9 MB)
    Collecting oauth2client
      Using cached oauth2client-4.1.3-py2.py3-none-any.whl (98 kB)
    Collecting scipy>=0.19.1
      Using cached scipy-1.8.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (41.6 MB)
    Collecting six
      Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
    Collecting tensorflow-model-optimization>=0.4.1
      Using cached tensorflow_model_optimization-0.7.2-py2.py3-none-any.whl (237 kB)
    Collecting pycocotools
      Using cached pycocotools-2.0.4-cp38-cp38-linux_x86_64.whl
    Collecting kaggle>=1.3.9
      Using cached kaggle-1.5.12.tar.gz (58 kB)
      Preparing metadata (setup.py) ... done
    Collecting tensorflow>=2.7.0
      Using cached tensorflow-2.9.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (511.7 MB)
    Collecting matplotlib
      Using cached matplotlib-3.5.2-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl (11.3 MB)
    Collecting seqeval
      Using cached seqeval-1.2.2.tar.gz (43 kB)
      Preparing metadata (setup.py) ... done
    Collecting tensorflow-addons
      Using cached tensorflow_addons-0.17.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
    Collecting pyyaml>=5.1
      Using cached PyYAML-6.0-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (701 kB)
    Collecting google-api-python-client>=1.6.7
      Using cached google_api_python_client-2.51.0-py2.py3-none-any.whl (8.6 MB)
    Collecting psutil>=5.4.3
      Using cached psutil-5.9.1-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (284 kB)
    Collecting sacrebleu
      Using cached sacrebleu-2.1.0-py3-none-any.whl (92 kB)
    Collecting Pillow
      Using cached Pillow-9.1.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.1 MB)
    Collecting sentencepiece
      Using cached sentencepiece-0.1.96-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
    Collecting tf-slim>=1.1.0
      Using cached tf_slim-1.1.0-py2.py3-none-any.whl (352 kB)
    Collecting contextlib2
      Using cached contextlib2-21.6.0-py2.py3-none-any.whl (13 kB)
    Collecting google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0dev,>=1.31.5
      Using cached google_api_core-2.8.2-py3-none-any.whl (114 kB)
    Collecting google-auth-httplib2>=0.1.0
      Using cached google_auth_httplib2-0.1.0-py2.py3-none-any.whl (9.3 kB)
    Collecting google-auth<3.0.0dev,>=1.16.0
      Using cached google_auth-2.8.0-py2.py3-none-any.whl (164 kB)
    Collecting httplib2<1dev,>=0.15.0
      Using cached httplib2-0.20.4-py3-none-any.whl (96 kB)
    Collecting uritemplate<5,>=3.0.1
      Using cached uritemplate-4.1.1-py2.py3-none-any.whl (10 kB)
    Collecting certifi
      Using cached certifi-2022.6.15-py3-none-any.whl (160 kB)
    Collecting requests
      Using cached requests-2.28.0-py3-none-any.whl (62 kB)
    Collecting tqdm
      Using cached tqdm-4.64.0-py2.py3-none-any.whl (78 kB)
    Collecting python-slugify
      Using cached python_slugify-6.1.2-py2.py3-none-any.whl (9.4 kB)
    Collecting urllib3
      Using cached urllib3-1.26.9-py2.py3-none-any.whl (138 kB)
    Collecting grpcio<2.0,>=1.24.3
      Using cached grpcio-1.47.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.5 MB)
    Collecting tensorboard<2.10,>=2.9
      Using cached tensorboard-2.9.1-py3-none-any.whl (5.8 MB)
    Collecting tensorflow-io-gcs-filesystem>=0.23.1
      Using cached tensorflow_io_gcs_filesystem-0.26.0-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (2.4 MB)
    Collecting libclang>=13.0.0
      Using cached libclang-14.0.1-py2.py3-none-manylinux1_x86_64.whl (14.5 MB)
    Collecting protobuf<3.20,>=3.9.2
      Using cached protobuf-3.19.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
    Collecting keras-preprocessing>=1.1.1
      Using cached Keras_Preprocessing-1.1.2-py2.py3-none-any.whl (42 kB)
    Collecting opt-einsum>=2.3.2
      Using cached opt_einsum-3.3.0-py3-none-any.whl (65 kB)
    Collecting google-pasta>=0.1.1
      Using cached google_pasta-0.2.0-py3-none-any.whl (57 kB)
    Collecting flatbuffers<2,>=1.12
      Using cached flatbuffers-1.12-py2.py3-none-any.whl (15 kB)
    Collecting h5py>=2.9.0
      Using cached h5py-3.7.0-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (4.5 MB)
    Collecting gast<=0.4.0,>=0.2.1
      Using cached gast-0.4.0-py3-none-any.whl (9.8 kB)
    Collecting wrapt>=1.11.0
      Using cached wrapt-1.14.1-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (81 kB)
    Collecting packaging
      Using cached packaging-21.3-py3-none-any.whl (40 kB)
    Collecting tensorflow-estimator<2.10.0,>=2.9.0rc0
      Using cached tensorflow_estimator-2.9.0-py2.py3-none-any.whl (438 kB)
    Requirement already satisfied: setuptools in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from tensorflow>=2.7.0->tf-models-official<=2.7.0,>=2.4.0->-r requirements.txt (line 3)) (49.2.1)
    Collecting keras<2.10.0,>=2.9.0rc0
      Using cached keras-2.9.0-py2.py3-none-any.whl (1.6 MB)
    Collecting astunparse>=1.6.0
      Using cached astunparse-1.6.3-py2.py3-none-any.whl (12 kB)
    Collecting termcolor>=1.1.0
      Using cached termcolor-1.1.0.tar.gz (3.9 kB)
      Preparing metadata (setup.py) ... done
    Collecting typing-extensions>=3.6.6
      Using cached typing_extensions-4.2.0-py3-none-any.whl (24 kB)
    Collecting dm-tree~=0.1.1
      Using cached dm_tree-0.1.7-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (142 kB)
    Collecting kiwisolver>=1.0.1
      Using cached kiwisolver-1.4.3-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.2 MB)
    Collecting pyparsing>=2.2.1
      Using cached pyparsing-3.0.9-py3-none-any.whl (98 kB)
    Collecting cycler>=0.10
      Using cached cycler-0.11.0-py3-none-any.whl (6.4 kB)
    Collecting fonttools>=4.22.0
      Using cached fonttools-4.33.3-py3-none-any.whl (930 kB)
    Collecting pyasn1-modules>=0.0.5
      Using cached pyasn1_modules-0.2.8-py2.py3-none-any.whl (155 kB)
    Collecting pyasn1>=0.1.7
      Using cached pyasn1-0.4.8-py2.py3-none-any.whl (77 kB)
    Collecting rsa>=3.1.4
      Using cached rsa-4.8-py3-none-any.whl (39 kB)
    Collecting colorama
      Using cached colorama-0.4.5-py2.py3-none-any.whl (16 kB)
    Collecting regex
      Using cached regex-2022.6.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (764 kB)
    Collecting portalocker
      Using cached portalocker-2.4.0-py2.py3-none-any.whl (16 kB)
    Collecting tabulate>=0.8.9
      Using cached tabulate-0.8.10-py3-none-any.whl (29 kB)
    Collecting scikit-learn>=0.21.3
      Using cached scikit_learn-1.1.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (31.2 MB)
    Collecting typeguard>=2.7
      Using cached typeguard-2.13.3-py3-none-any.whl (17 kB)
    Collecting toml
      Using cached toml-0.10.2-py2.py3-none-any.whl (16 kB)
    Collecting etils[epath]
      Using cached etils-0.6.0-py3-none-any.whl (98 kB)
    Collecting importlib-resources
      Using cached importlib_resources-5.8.0-py3-none-any.whl (28 kB)
    Collecting promise
      Using cached promise-2.3.tar.gz (19 kB)
      Preparing metadata (setup.py) ... done
    Collecting tensorflow-metadata
      Using cached tensorflow_metadata-1.9.0-py3-none-any.whl (51 kB)
    Collecting dill
      Using cached dill-0.3.5.1-py2.py3-none-any.whl (95 kB)
    Collecting wheel<1.0,>=0.23.0
      Using cached wheel-0.37.1-py2.py3-none-any.whl (35 kB)
    Collecting googleapis-common-protos<2.0dev,>=1.56.2
      Using cached googleapis_common_protos-1.56.3-py2.py3-none-any.whl (211 kB)
    Collecting cachetools<6.0,>=2.0.0
      Using cached cachetools-5.2.0-py3-none-any.whl (9.3 kB)
    Collecting idna<4,>=2.5
      Using cached idna-3.3-py3-none-any.whl (61 kB)
    Collecting charset-normalizer~=2.0.0
      Using cached charset_normalizer-2.0.12-py3-none-any.whl (39 kB)
    Collecting joblib>=1.0.0
      Using cached joblib-1.1.0-py2.py3-none-any.whl (306 kB)
    Collecting threadpoolctl>=2.0.0
      Using cached threadpoolctl-3.1.0-py3-none-any.whl (14 kB)
    Collecting werkzeug>=1.0.1
      Using cached Werkzeug-2.1.2-py3-none-any.whl (224 kB)
    Collecting tensorboard-data-server<0.7.0,>=0.6.0
      Using cached tensorboard_data_server-0.6.1-py3-none-manylinux2010_x86_64.whl (4.9 MB)
    Collecting markdown>=2.6.8
      Using cached Markdown-3.3.7-py3-none-any.whl (97 kB)
    Collecting google-auth-oauthlib<0.5,>=0.4.1
      Using cached google_auth_oauthlib-0.4.6-py2.py3-none-any.whl (18 kB)
    Collecting tensorboard-plugin-wit>=1.6.0
      Using cached tensorboard_plugin_wit-1.8.1-py3-none-any.whl (781 kB)
    Collecting zipp
      Using cached zipp-3.8.0-py3-none-any.whl (5.4 kB)
    Collecting text-unidecode>=1.3
      Using cached text_unidecode-1.3-py2.py3-none-any.whl (78 kB)
    Collecting requests-oauthlib>=0.7.0
      Using cached requests_oauthlib-1.3.1-py2.py3-none-any.whl (23 kB)
    Collecting importlib-metadata>=4.4
      Using cached importlib_metadata-4.12.0-py3-none-any.whl (21 kB)
    Collecting oauthlib>=3.0.0
      Using cached oauthlib-3.2.0-py3-none-any.whl (151 kB)
    Using legacy 'setup.py install' for ml_collections, since package 'wheel' is not installed.
    Using legacy 'setup.py install' for pysam, since package 'wheel' is not installed.
    Using legacy 'setup.py install' for kaggle, since package 'wheel' is not installed.
    Using legacy 'setup.py install' for py-cpuinfo, since package 'wheel' is not installed.
    Using legacy 'setup.py install' for seqeval, since package 'wheel' is not installed.
    Using legacy 'setup.py install' for termcolor, since package 'wheel' is not installed.
    Using legacy 'setup.py install' for promise, since package 'wheel' is not installed.
    Installing collected packages: text-unidecode, termcolor, tensorboard-plugin-wit, sentencepiece, pytz, pysam, pyasn1, py-cpuinfo, libclang, keras, gin-config, flatbuffers, dm-tree, zipp, wrapt, wheel, werkzeug, urllib3, uritemplate, typing-extensions, typeguard, tqdm, toml, threadpoolctl, tensorflow-io-gcs-filesystem, tensorflow-estimator, tensorboard-data-server, tabulate, six, rsa, regex, pyyaml, python-slugify, pyparsing, pyasn1-modules, psutil, protobuf, portalocker, Pillow, oauthlib, numpy, kiwisolver, joblib, idna, gast, fonttools, etils, dill, Cython, cycler, contextlib2, colorama, charset-normalizer, certifi, cachetools, absl-py, tf-slim, tensorflow-model-optimization, tensorflow-hub, scipy, sacrebleu, requests, python-dateutil, promise, packaging, opt-einsum, opencv-python-headless, ml_collections, keras-preprocessing, importlib-resources, importlib-metadata, httplib2, h5py, grpcio, googleapis-common-protos, google-pasta, google-auth, astunparse, tensorflow-metadata, tensorflow-addons, scikit-learn, requests-oauthlib, pandas, oauth2client, matplotlib, markdown, kaggle, google-auth-httplib2, google-api-core, tensorflow-datasets, seqeval, pycocotools, google-auth-oauthlib, google-api-python-client, tensorboard, tensorflow, tensorflow-text, tf-models-official
      Running setup.py install for termcolor ... done
      Running setup.py install for pysam ... done
      Running setup.py install for py-cpuinfo ... done
      Running setup.py install for promise ... done
      Running setup.py install for ml_collections ... done
      Running setup.py install for kaggle ... done
      Running setup.py install for seqeval ... done
    Successfully installed Cython-0.29.30 Pillow-9.1.1 absl-py-1.1.0 astunparse-1.6.3 cachetools-5.2.0 certifi-2022.6.15 charset-normalizer-2.0.12 colorama-0.4.5 contextlib2-21.6.0 cycler-0.11.0 dill-0.3.5.1 dm-tree-0.1.7 etils-0.6.0 flatbuffers-1.12 fonttools-4.33.3 gast-0.4.0 gin-config-0.5.0 google-api-core-2.8.2 google-api-python-client-2.51.0 google-auth-2.8.0 google-auth-httplib2-0.1.0 google-auth-oauthlib-0.4.6 google-pasta-0.2.0 googleapis-common-protos-1.56.3 grpcio-1.47.0 h5py-3.7.0 httplib2-0.20.4 idna-3.3 importlib-metadata-4.12.0 importlib-resources-5.8.0 joblib-1.1.0 kaggle-1.5.12 keras-2.9.0 keras-preprocessing-1.1.2 kiwisolver-1.4.3 libclang-14.0.1 markdown-3.3.7 matplotlib-3.5.2 ml_collections-0.1.1 numpy-1.23.0 oauth2client-4.1.3 oauthlib-3.2.0 opencv-python-headless-4.6.0.66 opt-einsum-3.3.0 packaging-21.3 pandas-1.4.3 portalocker-2.4.0 promise-2.3 protobuf-3.19.4 psutil-5.9.1 py-cpuinfo-8.0.0 pyasn1-0.4.8 pyasn1-modules-0.2.8 pycocotools-2.0.4 pyparsing-3.0.9 pysam-0.19.1 python-dateutil-2.8.2 python-slugify-6.1.2 pytz-2022.1 pyyaml-6.0 regex-2022.6.2 requests-2.28.0 requests-oauthlib-1.3.1 rsa-4.8 sacrebleu-2.1.0 scikit-learn-1.1.1 scipy-1.8.1 sentencepiece-0.1.96 seqeval-1.2.2 six-1.16.0 tabulate-0.8.10 tensorboard-2.9.1 tensorboard-data-server-0.6.1 tensorboard-plugin-wit-1.8.1 tensorflow-2.9.1 tensorflow-addons-0.17.1 tensorflow-datasets-4.6.0 tensorflow-estimator-2.9.0 tensorflow-hub-0.12.0 tensorflow-io-gcs-filesystem-0.26.0 tensorflow-metadata-1.9.0 tensorflow-model-optimization-0.7.2 tensorflow-text-2.9.0 termcolor-1.1.0 text-unidecode-1.3 tf-models-official-2.7.0 tf-slim-1.1.0 threadpoolctl-3.1.0 toml-0.10.2 tqdm-4.64.0 typeguard-2.13.3 typing-extensions-4.2.0 uritemplate-4.1.1 urllib3-1.26.9 werkzeug-2.1.2 wheel-0.37.1 wrapt-1.14.1 zipp-3.8.0
    Collecting intel-tensorflow<=2.7.0,>=2.4.0
      Using cached intel_tensorflow-2.7.0-cp38-cp38-manylinux2010_x86_64.whl (186.4 MB)
    Requirement already satisfied: absl-py>=0.4.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (1.1.0)
    Requirement already satisfied: flatbuffers<3.0,>=1.12 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (1.12)
    Requirement already satisfied: google-pasta>=0.1.1 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (0.2.0)
    Requirement already satisfied: numpy>=1.14.5 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (1.23.0)
    Requirement already satisfied: grpcio<2.0,>=1.24.3 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (1.47.0)
    Collecting tensorflow-estimator<2.8,~=2.7.0rc0
      Using cached tensorflow_estimator-2.7.0-py2.py3-none-any.whl (463 kB)
    Requirement already satisfied: six>=1.12.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (1.16.0)
    Requirement already satisfied: protobuf>=3.9.2 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (3.19.4)
    Requirement already satisfied: opt-einsum>=2.3.2 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (3.3.0)
    Requirement already satisfied: tensorboard~=2.6 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (2.9.1)
    Requirement already satisfied: h5py>=2.9.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (3.7.0)
    Requirement already satisfied: wheel<1.0,>=0.32.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (0.37.1)
    Requirement already satisfied: termcolor>=1.1.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (1.1.0)
    Requirement already satisfied: gast<0.5.0,>=0.2.1 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (0.4.0)
    Requirement already satisfied: libclang>=9.0.1 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (14.0.1)
    Requirement already satisfied: wrapt>=1.11.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (1.14.1)
    Requirement already satisfied: astunparse>=1.6.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (1.6.3)
    Requirement already satisfied: keras-preprocessing>=1.1.1 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (1.1.2)
    Collecting keras<2.8,>=2.7.0rc0
      Using cached keras-2.7.0-py2.py3-none-any.whl (1.3 MB)
    Requirement already satisfied: typing-extensions>=3.6.6 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (4.2.0)
    Requirement already satisfied: tensorflow-io-gcs-filesystem>=0.21.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (0.26.0)
    Requirement already satisfied: markdown>=2.6.8 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (3.3.7)
    Requirement already satisfied: requests<3,>=2.21.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (2.28.0)
    Requirement already satisfied: setuptools>=41.0.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (49.2.1)
    Requirement already satisfied: tensorboard-data-server<0.7.0,>=0.6.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (0.6.1)
    Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (1.8.1)
    Requirement already satisfied: werkzeug>=1.0.1 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (2.1.2)
    Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (0.4.6)
    Requirement already satisfied: google-auth<3,>=1.6.3 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (2.8.0)
    Requirement already satisfied: cachetools<6.0,>=2.0.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from google-auth<3,>=1.6.3->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (5.2.0)
    Requirement already satisfied: rsa<5,>=3.1.4 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from google-auth<3,>=1.6.3->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (4.8)
    Requirement already satisfied: pyasn1-modules>=0.2.1 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from google-auth<3,>=1.6.3->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (0.2.8)
    Requirement already satisfied: requests-oauthlib>=0.7.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from google-auth-oauthlib<0.5,>=0.4.1->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (1.3.1)
    Requirement already satisfied: importlib-metadata>=4.4 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from markdown>=2.6.8->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (4.12.0)
    Requirement already satisfied: charset-normalizer~=2.0.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from requests<3,>=2.21.0->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (2.0.12)
    Requirement already satisfied: certifi>=2017.4.17 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from requests<3,>=2.21.0->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (2022.6.15)
    Requirement already satisfied: urllib3<1.27,>=1.21.1 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from requests<3,>=2.21.0->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (1.26.9)
    Requirement already satisfied: idna<4,>=2.5 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from requests<3,>=2.21.0->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (3.3)
    Requirement already satisfied: zipp>=0.5 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from importlib-metadata>=4.4->markdown>=2.6.8->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (3.8.0)
    Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (0.4.8)
    Requirement already satisfied: oauthlib>=3.0.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (3.2.0)
    Installing collected packages: tensorflow-estimator, keras, intel-tensorflow
      Attempting uninstall: tensorflow-estimator
        Found existing installation: tensorflow-estimator 2.9.0
        Uninstalling tensorflow-estimator-2.9.0:
          Successfully uninstalled tensorflow-estimator-2.9.0
      Attempting uninstall: keras
        Found existing installation: keras 2.9.0
        Uninstalling keras-2.9.0:
          Successfully uninstalled keras-2.9.0
    ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
    tensorflow 2.9.1 requires keras<2.10.0,>=2.9.0rc0, but you have keras 2.7.0 which is incompatible.
    tensorflow 2.9.1 requires tensorflow-estimator<2.10.0,>=2.9.0rc0, but you have tensorflow-estimator 2.7.0 which is incompatible.
    Successfully installed intel-tensorflow-2.7.0 keras-2.7.0 tensorflow-estimator-2.7.0
    

    I then ran run_all_tests.sh on a compute node via slurm and the full output is in this gist: https://gist.github.com/abueg/421ca972563b5c32825cde17525a49bf

    There is this exception in the output, which leads me to think the installation failed:

    Exception ignored in: <function Pool.__del__ at 0x7fe60613b820>
    Traceback (most recent call last):
      File "/vggpfs/fs3/vgl/store/labueg/anaconda3/lib/python3.8/multiprocessing/pool.py", line 268, in __del__
        self._change_notifier.put(None)
      File "/vggpfs/fs3/vgl/store/labueg/anaconda3/lib/python3.8/multiprocessing/queues.py", line 368, in put
        self._writer.send_bytes(obj)
      File "/vggpfs/fs3/vgl/store/labueg/anaconda3/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
        self._send_bytes(m[offset:offset + size])
      File "/vggpfs/fs3/vgl/store/labueg/anaconda3/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
        self._send(header + buf)
      File "/vggpfs/fs3/vgl/store/labueg/anaconda3/lib/python3.8/multiprocessing/connection.py", line 368, in _send
        n = write(self._handle, buf)
    OSError: [Errno 9] Bad file descriptor
    

    Would this error arise from the tensorflow dependency problem, in which case should I try to install tensorflow 2.9.1? Or have I gone wrong somewhere else?
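For what it's worth, the pip warning above is a plain major.minor mismatch: tensorflow 2.9.1 wants keras and tensorflow-estimator from the 2.9 series, while the intel-tensorflow 2.7.0 install downgraded both to 2.7.0. A toy shell sketch of the violated constraint (illustrative only, `same_minor` is made up here):

```shell
# same_minor succeeds when two version strings share the same major.minor
# prefix; this is the compatibility condition pip reports as broken above.
same_minor() { [ "${1%.*}" = "${2%.*}" ]; }

same_minor 2.9.1 2.9.0 && echo "2.9.1 vs 2.9.0: compatible"
same_minor 2.9.1 2.7.0 || echo "2.9.1 vs 2.7.0: mismatch (the reported conflict)"
```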

    There is also this error before the exception: [E::idx_find_and_load] Could not retrieve index file for 'deepconsensus/testdata/human_1m/subreads_to_ccs.bam', but there is no index file for that file in the deepconsensus/testdata/human_1m/ directory, so should I index it prior to running the tests?
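Regarding the `[E::idx_find_and_load]` message: htslib emits it when no `.bai` index sits next to a BAM it opens, and for sequential reading it is typically harmless. A small sketch to check for the index (path taken from the message; the `samtools index` call is only suggested, not run):

```shell
# Report whether a BAM index exists next to the file htslib complained about.
BAM=deepconsensus/testdata/human_1m/subreads_to_ccs.bam
if [ -e "${BAM}.bai" ]; then
  echo "index present"
else
  echo "no index; could create one with: samtools index $BAM"
fi
```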

    Any help would be appreciated, thank you in advance!

    opened by abueg 7
  • Error: ModuleNotFoundError: No module named 'pandas._libs.interval'

    Error: ModuleNotFoundError: No module named 'pandas._libs.interval'

    Hi, I'm getting this error with the new DeepConsensus 1.0.0. I'm running the exact same command that worked with the previous DeepConsensus version, from the Docker image.

    2022-10-11 16:17:35.711032: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
    To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
    Traceback (most recent call last):
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/pandas/__init__.py", line 30, in <module>
        from pandas._libs import hashtable as _hashtable, lib as _lib, tslib as _tslib
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/pandas/_libs/__init__.py", line 13, in <module>
        from pandas._libs.interval import Interval
    ModuleNotFoundError: No module named 'pandas._libs.interval'
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/opt/conda/envs/bio/bin/deepconsensus", line 8, in <module>
        sys.exit(run())
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/deepconsensus/cli.py", line 111, in run
        app.run(main, flags_parser=parse_flags)
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/absl/app.py", line 312, in run
        _run_main(main, args)
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
        sys.exit(main(argv))
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/deepconsensus/cli.py", line 99, in main
        from deepconsensus.inference import quick_inference
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/deepconsensus/inference/quick_inference.py", line 53, in <module>
        import pandas as pd
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/pandas/__init__.py", line 34, in <module>
        raise ImportError(
    ImportError: C extension: No module named 'pandas._libs.interval' not built. If you want to import pandas from the source directory, you may need to run 'python setup.py build_ext --inplace --force' to build the C extensions first.
    
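    One clue in the traceback above: it mixes two interpreter trees (`/share/apps/python/3.8.6/...` and the container's `/opt/conda/envs/bio/lib/python3.9/...`), which usually means a host `PYTHONPATH` is leaking into the container. A minimal diagnostic sketch (run inside the container):

    ```shell
    # Print where python actually looks for packages; host paths showing up
    # here inside the container would explain the mixed-version pandas import.
    echo "PYTHONPATH=${PYTHONPATH:-<unset>}"
    python3 -c "import sys; print('\n'.join(sys.path))"
    ```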
    opened by gevro 6
  • A tutorial for running DeepConsensus

    A tutorial for running DeepConsensus

    My computer hardware looks like this:

    OS: Ubuntu 20.04.3 LTS (x86_64)
    Python version: Python 3.8.10
    CPUs: i7 10700k(8c16t, SkyLake)
    Memory: 32G
    GPU: 1 NVIDIA RTX A4000 8G
    

    Install the required packages

    Create an environment for deepconsensus using conda

    mamba create -n deepconsensus -c bioconda -c conda-forge python=3.8 pbcore pbbam pbccs pbmm2 parallel jq gcc pycocotools bioconda::seqtk bioconda::unimap bioconda::bedtools bioconda::minimap2 bioconda::extracthifi bioconda::zmwfilter bioconda::pysam bioconda::samtools=1.10 bioconda::pyfastx=0.8.4
    

    Download actc for read mapping

    wget https://github.com/PacificBiosciences/align-clr-to-ccs/releases/download/0.1.0/actc 
    chmod u+x actc
    mv actc PATH/miniconda3/envs/deepconsensus/bin
    
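    Before moving on, an optional check (the `check_tool` helper is made up here) that the binaries the tutorial relies on actually resolve on `PATH`:

    ```shell
    # Report whether each required command resolves on PATH.
    check_tool() {
      if command -v "$1" >/dev/null 2>&1; then echo "$1: found"; else echo "$1: missing"; fi
    }
    check_tool actc
    check_tool ccs
    check_tool samtools
    ```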

    Install deepconsensus[gpu] using pip

    conda activate deepconsensus
    pip install deepconsensus[gpu]==0.2.0
    

    Prepare all the needed input files for DeepConsensus

    Get the ccs.bam

    ccs --all -j 15 raw.subreads.bam out.ccs.bam
    

    Get the subreads_to_ccs.bam

    Tips

    If you use actc to map the subreads to the CCS reads without chunking, you may encounter this error when running DeepConsensus.

    I0324 19:48:00.776319 140117319313216 quick_inference.py:492] Processed a batch of 100 ZMWs in 62.39794731140137 seconds
    I0324 19:48:00.808807 140117319313216 quick_inference.py:570] Processed 7000 ZMWs in 4584.726703 seconds
    Process ForkPoolWorker-1061:
    Traceback (most recent call last):
      File "/home/wanglab/miniconda3/envs/deepconsensus/lib/python3.8/multiprocessing/pool.py", line 131, in worker
        put((job, i, result))
      File "/home/wanglab/miniconda3/envs/deepconsensus/lib/python3.8/multiprocessing/queues.py", line 368, in put
        self._writer.send_bytes(obj)
      File "/home/wanglab/miniconda3/envs/deepconsensus/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
        self._send_bytes(m[offset:offset + size])
      File "/home/wanglab/miniconda3/envs/deepconsensus/lib/python3.8/multiprocessing/connection.py", line 405, in _send_bytes
        self._send(buf)
      File "/home/wanglab/miniconda3/envs/deepconsensus/lib/python3.8/multiprocessing/connection.py", line 368, in _send
        n = write(self._handle, buf)
    BrokenPipeError: [Errno 32] Broken pipe
    

    This error occurs because the number of open streams hits an upper limit as the iterations accumulate. To avoid it, chunk the data when running actc.

    Chunking your subreads.bam

    ### Generating all command lines using shell
    for i in {1..1000}; do echo 'actc -j 1 raw.subreads.bam out.ccs.bam subreads_to_ccs.'${i}'.bam --chunk '${i}'/1000' ; done > actc_chunk.job
    
    ### Submitting all jobs in parallel using GNU parallel
    parallel -j 15 < actc_chunk.job
    
    ### Index all the subreads_to_ccs.${i}.fasta
    for i in {1..1000}; do echo 'samtools faidx subreads_to_ccs.'${i}'.fasta' ; done > samtools_index.job
    
    parallel -j 15 < samtools_index.job
    

    Get the model for Deepconsensus

    mkdir deepconsensus_model && cd deepconsensus_model
    wget https://storage.googleapis.com/brain-genomics-public/research/deepconsensus/models/v0.2/params.json
    wget https://storage.googleapis.com/brain-genomics-public/research/deepconsensus/models/v0.2/checkpoint-50.index
    wget https://storage.googleapis.com/brain-genomics-public/research/deepconsensus/models/v0.2/checkpoint-50.data-00000-of-00001
    

Run DeepConsensus

    for i in {1..1000};
    do
    deepconsensus run \
      --subreads_to_ccs=subreads_to_ccs.${i}.bam  \
      --ccs_fasta=subreads_to_ccs.${i}.fasta \
      --checkpoint=deepconsensus_model/checkpoint-50 \
      --output=output.${i}.fastq \
      --batch_zmws=100
    done
    

    Merge the output

    cat output.*.fastq > total.fastq
    
    opened by shengxinzhuan 6
  • can deepconsensus run on an arm machine?

    can deepconsensus run on an arm machine?

    Hello, DeepConsensus team! Thanks for making an amazing tool for HiFi sequencing. It works well on x86 CPU and GPU machines. However, when I tried to install it on an ARM HPC, installation of the base requirements packages failed. Do you have plans to port DeepConsensus to ARM machines?

    opened by shengxinzhuan 6
  • Error: StopIteration

    Error: StopIteration

    Hi, I'm getting this error. What is the cause of this? Thanks.

    singularity run -W /data -B /scratch/projects/bin/deepconsensus/model:/model -B $(pwd) /scratch/projects/bin/deepconsensus/deepconsensus_0.3.1.sif deepconsensus run --batch_size=1024 --batch_zmws=100 --cpus 1 --max_passes 20 --subreads_to_ccs=blah.subreads_to_ccs.0018.bam --ccs_bam=blah.ccs.0018.bam --checkpoint=/model/checkpoint --output=blah.fastq

    =================================================================
    Total params: 8,942,667
    Trainable params: 8,942,667
    Non-trainable params: 0
    _________________________________________________________________
    I0809 10:21:26.338892 140397309962048 model_utils.py:231] Setting hidden size to transformer_input_size.
    I0809 10:21:26.339057 140397309962048 quick_inference.py:484] Finished initialize_model.
    I0809 10:21:26.339549 140397309962048 quick_inference.py:738] Model setup took 1.790560245513916 seconds.
    Traceback (most recent call last):
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/preprocess/utils.py", line 981, in proc_feeder
        ccs_bam_read = next(ccs_bam_h)
      File "pysam/libcalignmentfile.pyx", line 1874, in pysam.libcalignmentfile.AlignmentFile.__next__
    StopIteration
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/opt/conda/envs/bio/bin/deepconsensus", line 8, in <module>
        sys.exit(run())
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/cli.py", line 111, in run
        app.run(main, flags_parser=parse_flags)
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/absl/app.py", line 312, in run
        _run_main(main, args)
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
        sys.exit(main(argv))
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/cli.py", line 102, in main
        app.run(quick_inference.main, argv=passed)
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/absl/app.py", line 312, in run
        _run_main(main, args)
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
        sys.exit(main(argv))
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/inference/quick_inference.py", line 814, in main
        outcome_counter = run()
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/inference/quick_inference.py", line 762, in run
        for zmw, subreads, dc_config in input_file_generator:
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/inference/quick_inference.py", line 428, in stream_bam
        for input_data in proc_feeder():
    RuntimeError: generator raised StopIteration
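    A StopIteration raised from `next(ccs_bam_h)` at this point is often a symptom of an empty or truncated `--ccs_bam` chunk rather than a bug in the model itself. As a minimal, hedged sanity check (not part of DeepConsensus; the path below is a placeholder), a well-formed BAM file ends with the standard 28-byte BGZF EOF block defined in the SAM/BAM specification:

    ```python
    # Minimal BAM truncation check: a well-formed BAM ends with the
    # standard 28-byte BGZF EOF block from the SAM/BAM specification.
    BGZF_EOF = bytes.fromhex(
        "1f8b08040000000000ff0600424302001b0003000000000000000000"
    )

    def looks_truncated(path: str) -> bool:
        """Return True if the file does not end with the BGZF EOF block."""
        with open(path, "rb") as f:
            f.seek(0, 2)  # seek to end of file
            size = f.tell()
            if size < len(BGZF_EOF):
                return True
            f.seek(size - len(BGZF_EOF))
            return f.read() != BGZF_EOF
    ```

    Running this over each `blah.ccs.*.bam` chunk (or simply `samtools quickcheck` where samtools is available) can rule out a truncated input before re-running inference.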
    
    opened by gevro 5
  • Docker or Singularity

    Docker or Singularity

    Hi,

    First of all, thank you for this amazing program. I'm working on a cluster running CentOS 7. I tried to install the software several times, always hitting errors in the pip/Python dependencies. Could you make a Docker and/or Singularity image available for DeepConsensus? Best regards, Andre

    opened by AMMMachado 5
  • Error detecting params.json using docker in debian (10) HPC

    Error detecting params.json using docker in debian (10) HPC

    docker run google/deepconsensus:1.1.0 deepconsensus run --subreads_to_ccs=m54274Ue_220814_163631.aligned.subreads.bam --ccs_bam=m54274Ue_220814_163631.hifi_S3_reads.bam --checkpoint=model/checkpoint --output=m54274Ue_220814_163631_deepcon.output.fastq

    Traceback (most recent call last):
      File "/opt/conda/envs/bio/bin/deepconsensus", line 8, in <module>
        sys.exit(run())
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/deepconsensus/cli.py", line 111, in run
        app.run(main, flags_parser=parse_flags)
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/absl/app.py", line 312, in run
        _run_main(main, args)
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
        sys.exit(main(argv))
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/deepconsensus/cli.py", line 102, in main
        app.run(quick_inference.main, argv=passed)
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/absl/app.py", line 312, in run
        _run_main(main, args)
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
        sys.exit(main(argv))
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/deepconsensus/inference/quick_inference.py", line 842, in main
        outcome_counter = run()
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/deepconsensus/inference/quick_inference.py", line 703, in run
        params = model_utils.read_params_from_json(checkpoint_path=FLAGS.checkpoint)
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/deepconsensus/models/model_utils.py", line 405, in read_params_from_json
        json.load(tf.io.gfile.GFile(json_path, 'r')))
      File "/opt/conda/envs/bio/lib/python3.9/json/__init__.py", line 293, in load
        return loads(fp.read(),
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/tensorflow/python/lib/io/file_io.py", line 116, in read
        self._preread_check()
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/tensorflow/python/lib/io/file_io.py", line 77, in _preread_check
        self._read_buf = _pywrap_file_io.BufferedInputStream(
    tensorflow.python.framework.errors_impl.NotFoundError: model/params.json; No such file or directory

    I am getting this error even though I have all the model files (checkpoint.data-00000-of-00001, checkpoint.index, params.json) in the model dir.
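    Per the traceback, read_params_from_json looks for params.json next to the --checkpoint prefix, and the relative path model/params.json is resolved inside the container, where host files are invisible unless the directory is mounted (e.g. docker run -v /host/model:/model ... --checkpoint=/model/checkpoint). A hedged sketch of that lookup — check_checkpoint_dir is a hypothetical helper for illustration, not DeepConsensus code:

    ```python
    import json
    import os

    def check_checkpoint_dir(checkpoint_prefix: str) -> dict:
        # Hypothetical helper mirroring the lookup shown in the traceback:
        # params.json must sit in the same directory as the checkpoint prefix.
        json_path = os.path.join(os.path.dirname(checkpoint_prefix), "params.json")
        if not os.path.exists(json_path):
            raise FileNotFoundError(
                f"{json_path} not found; if running under Docker, mount the host "
                "model directory into the container and pass an in-container "
                "--checkpoint path")
        with open(json_path) as f:
            return json.load(f)
    ```

    Running a check like this on the path as seen from inside the container quickly distinguishes "files missing" from "files present on the host but not mounted".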

    opened by ap1438 1
  • Lower number of >Q30 average quality reads for v1.1 compared to v0.3

    Lower number of >Q30 average quality reads for v1.1 compared to v0.3

    Hi all,

    I am assembling a genome of a land snail that has extreme repeat content (~85%) and large genome size (6.6 gb). My mean insert size is 8kb and I have data from six SMRT cells.

    I have run DeepConsensus (CPU only) on my six SMRT cells using v0.3, and on two of the SMRT cells using v1.1. I have noticed that I get more reads with >Q20 average quality using v1.1, but fewer reads that are >Q30 compared to v0.3. Histograms of average read quality attached. v0.3.qchist.txt v1.1.qchist.txt

    Manual inspection of the same reads from either version confirmed that most reads had longer regions of lower quality in v1.1 than v0.3. First 100 reads for v0.3 and v1.1 attached (.txt extension for github upload). v0.3_smrtcell1_100.fastq.txt v1.1_smrtcell1_100.fastq.txt

    This was surprising to me as my expectation was that the >Q20 yield would remain relatively constant between versions but >Q30 yield would increase.

    Perhaps this is the result of lower insert length or high repeat content of this library? I would appreciate hearing the DeepConsensus team's thoughts on this discrepancy. Any help would be appreciated!

    Thanks! Mason
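    For anyone comparing histograms like these: average read quality in this context is conventionally computed by averaging per-base error probabilities and converting the mean back to Phred scale, not by averaging Phred scores directly. A small, hedged sketch of that standard Phred arithmetic (not DeepConsensus code):

    ```python
    import math

    def avg_read_quality(phred_scores):
        # Convert per-base Phred scores to error probabilities, average the
        # probabilities, then convert the mean back to a Phred-scaled quality.
        errors = [10 ** (-q / 10) for q in phred_scores]
        mean_error = sum(errors) / len(errors)
        return -10 * math.log10(mean_error)
    ```

    Because the average is taken in probability space, a read that is half Q40 and half Q20 lands near Q23 rather than Q30, which is one way longer low-quality stretches can pull reads below Q30 without reducing the >Q20 count.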

    opened by mason-linscott 3
  • Missing majority of ZMWs after running lima to search for adapters

    Missing majority of ZMWs after running lima to search for adapters

    Hi, I'm not sure if you can help me with this, but I just want to raise an issue I've encountered after filtering adapters with lima. I'm not entirely sure why lima filtered out the majority of the DeepConsensus HiFi reads. Below are the scripts for DeepConsensus and lima:

    DC

    cmd4="module purge && module load deepconsensus/0.3.1 && deepconsensus run --checkpoint=/cluster/home/dc_model_0.3/checkpoint --ccs_bam=${outDir}/${outFilePrefix}.${SLURM_ARRAY_TASK_ID}.bam --subreads_to_ccs=${outDir}/${outFilePrefix%.ccs}.subreads_to_ccs.${SLURM_ARRAY_TASK_ID}.bam --output=${outDir}/${outFilePrefix%.ccs}.deepconsensus.${SLURM_ARRAY_TASK_ID}.fastq --cpus ${THREAD}"

    Lima

    lima --num-threads 84 --split-bam-named --same --ccs ${ID}.deepconsensus.fastq /cluster/home/lima_pbmarkdup/pb_pcr_adapter.fa ${ID}.deepconsensus.lima.fastq

    Here is the output summary from lima:

    ZMWs input (A)                : 1925775
    ZMWs above all thresholds (B) : 13042 (0.68%)
    ZMWs below any threshold (C)  : 1912733 (99.32%)

    ZMW marginals for (C):
    Below min length              : 199 (0.01%)
    Below min score               : 648496 (33.90%)
    Below min end score           : 648496 (33.90%)
    Below min passes              : 0 (0.00%)
    Below min score lead          : 648496 (33.90%)
    Below min ref span            : 1912733 (100.00%)
    Without SMRTbell adapter      : 0 (0.00%)

    ZMWs for (B):
    With same pair                : 13042 (100.00%)
    Coefficient of correlation    : 0.00%

    ZMWs for (A):
    Allow diff pair               : 1925775 (100.00%)
    Allow same pair               : 1925775 (100.00%)

    Reads for (B):
    Above length                  : 13042 (100.00%)
    Below length                  : 0 (0.00%)

    Thank you, I appreciate your help!

    opened by rosspdu 3
Releases(v1.1.0)
  • v1.1.0(Dec 16, 2022)

    • DeepConsensus v1.1 introduces a new model that improves coverage of telomere regions, achieved through improved filtering of the training data with CHM13 high-confidence regions.
    • Improved yield at empirical Q30 from 187.1% in v1.0 to 194.4% in v1.1, relative to ccs baseline of 100%. This was achieved through improvements to the attention layer in the model.
    • Updated the training tutorial for training on TPUs that users can use as a proof-of-concept to develop a training setup.
    • This release evaluates performance using an updated HG002 truth assembly. We have re-evaluated previous releases with this updated dataset and updated Q30 yields accordingly.
    • Thanks to Sergey Koren (@skoren) from NIH, NHGRI and the T2T consortium for invaluable feedback on the coverage of telomeric regions.
    • Thanks to Daniel Liu (@Daniel-Liu-c0deb0t) for incorporating prior knowledge/sparsity in the attention layer of the model, which significantly improved the accuracy and Q30 yield.
    • Thanks to Armin Töpfer (@armintoepfer), Aaron Wenger (@amwenger), and William Rowell (@williamrowell) at PacBio for advice and collaboration.
    Source code(tar.gz)
    Source code(zip)
  • v1.0.0(Oct 11, 2022)

    • DeepConsensus v1.0 introduces a new model that greatly improves the empirical Q30 yield across chemistries and the insert sizes we tested. For example, using our chem2.2_24kb dataset we observe an increase in Q30 yield from 149% to 176%.
    • We reduced the size of our model (using distillation) and the size of the model inputs to lower runtime by approximately 10%, while still improving accuracy over v0.3.
    • DeepConsensus can now output a BAM file. BAM output can be used to examine the effective coverage (ec), number of passes (np), or predicted average read accuracy (rq).
    • v1.0 introduces a training tutorial that users can use as a proof-of-concept to develop a training setup.
    • Models introduced previously (v0.1, v0.2, v0.3) are not compatible with v1.0 and vice versa.
    • --max_passes and --example_width are now defined by the model params.json file. Users do not need to set these flags when running inference. The --padding flag has been removed. Padding is no longer added to model inputs.
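    The ec (effective coverage), np (number of passes), and rq (predicted read accuracy) values mentioned above are standard SAM/BAM auxiliary tags, readable with samtools view or pysam's AlignedSegment.get_tag. As a hedged, dependency-free illustration of the tag format (parse_sam_tags is a hypothetical helper; the values are made up):

    ```python
    def parse_sam_tags(tag_fields):
        # Parse SAM auxiliary fields of the form NAME:TYPE:VALUE into typed
        # values; 'i' (integer) and 'f' (float) cover the ec/np/rq tags.
        converters = {"i": int, "f": float}
        tags = {}
        for field in tag_fields:
            name, type_code, value = field.split(":", 2)
            tags[name] = converters.get(type_code, str)(value)
        return tags

    # Illustrative tag fields from one output record (values are made up).
    tags = parse_sam_tags(["ec:f:32.5", "np:i:30", "rq:f:0.9991"])
    ```

    In practice one would read these through pysam or awk over samtools view output; the sketch only shows what the fields mean.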

    Acknowledgements

    • Thanks to Armin Töpfer (@armintoepfer), Aaron Wenger (@amwenger), and William Rowell (@williamrowell) at PacBio for advice and collaboration.
    • Thanks to Lucas Brambrink (@lucasbrambrink) for model experiments and analysis.
    • Thanks to Daniel Liu (@Daniel-Liu-c0deb0t) for model experiments, analysis, and advice.
    Source code(tar.gz)
    Source code(zip)
  • v0.3.1(Jul 19, 2022)

  • v0.3.0(Jul 6, 2022)

    Change Log

    • Runtime speedup of 4.9X compared to v0.2.
    • Improved yield at empirical Q30 from 141% in v0.2 to 149%, relative to the ccs baseline of 100%. This was achieved through improvements to the training data, including use of the new CHM13 T2T assembly (chm13v2.0_noY) and sequencing.
    • Added a documentation page with yield metrics for 3 SMRT Cells with different read length distributions.
    • Updated recommendation for ccs settings to skip very low-quality reads, saving runtime.
    • Model input condenser layer added, saving runtime.
    • To save significant runtime, added the --skip_windows_above option, which skips running the model on windows whose CCS predicted quality already exceeds a threshold (default Q45).
    • Memory profiling with batch option recommendations.
    • Added support for TensorFlow SavedModel for portability.
    • Added base quality calibration tuned for v0.3 model, customizable with --dc_calibration option.
    • The --min-quality flag default was changed from 20 to 0 in this version. This change was reverted in v0.3.1.

    Acknowledgement

    • Thanks to Armin Töpfer, Aaron Wenger, and William Rowell at PacBio for advice and collaboration.
    • Thanks to Felipe Llinares for contributing a new alignment training metric.
    • Thanks to Moshe Wagner for adding a multiprocessing speedup to the preprocessing stage.
    • Thanks to Joel Shor for model advice and code reviews.
    Source code(tar.gz)
    Source code(zip)
  • v0.2.0(Jan 18, 2022)

    Change Log

    • Substantial (>10x) speed increase relative to v0.1.
    • DeepConsensus now supports GPU execution. In our tests, using an NVIDIA V100 GPU is ~3.3x faster than CPU alone.
    • Reduced installation complexity by removing Nucleus and Apache Beam dependencies. Added support for newer TensorFlow versions.
    • CPU and GPU pip packages are now available alongside corresponding Docker images.
    • A more user-friendly command-line interface has been added and can be invoked using deepconsensus.
    • A simplified one-step solution for running DeepConsensus has been developed and can be invoked using deepconsensus run.
    • Small improvements to accuracy by better mapping of repetitive subreads with actc, increasing Q30 yield by 31.3% relative to pbccs, compared to 30.6% for DeepConsensus v0.1.

    Thanks to Armin Töpfer for actc support and Jeremy Schmutz for evaluations and feedback.

    Source code(tar.gz)
    Source code(zip)
  • v0.1.0(Jan 15, 2022)

Owner
Google
Google ❤️ Open Source