Visual Room Rearrangement

Overview

AI2-THOR Rearrangement Challenge

Welcome to the 2021 AI2-THOR Rearrangement Challenge hosted at the CVPR'21 Embodied-AI Workshop. The goal of this challenge is to build a model/agent that moves objects in a room to restore them to a given initial configuration. Please follow the instructions below to get started.

If you have any questions please file an issue or post in the #rearrangement-challenge channel on our Ask PRIOR slack.

Installation

To begin, clone this repository locally:

git clone git@github.com:allenai/ai2thor-rearrangement.git

Here's a quick summary of the most important files/directories in this repository:

  • example.py an example script showing how rearrangement tasks can be instantiated for training and validation.
  • baseline_configs/
    • rearrange_base.py The base configuration file which defines the challenge parameters (e.g. screen size, allowed actions, etc).
    • one_phase/*.py - Baseline experiment configurations for the 1-phase challenge track.
    • two_phase/*.py - Baseline experiment configurations for the 2-phase challenge track.
    • walkthrough/*.py - Baseline experiment configurations if one wants to train the walkthrough phase in isolation.
  • rearrange/
    • baseline_models.py - A collection of baseline models for the 1- and 2-phase challenge tasks. These Actor-Critic models use a CNN->RNN architecture and can be trained using the experiment configs under the baseline_configs/[one/two]_phase/ directories.
    • constants.py - Constants used to define the rearrangement task. These include the step size taken by the agent, the unique id of the THOR build we use, etc.
    • environment.py - The definition of the RearrangeTHOREnvironment class that wraps the AI2-THOR environment and enables setting up rearrangement tasks.
    • expert.py - The definition of a heuristic expert (GreedyUnshuffleExpert) which uses privileged information (e.g. the scene graph & knowledge of exact object poses) to solve the rearrangement task. This heuristic expert is meant to be used to produce expert actions for use with imitation learning techniques. See the query_expert method within the rearrange.tasks.UnshuffleTask class for an example of how such an action can be generated.
    • losses.py - Losses (outside of those provided by AllenAct by default) used to train our baseline agents.
    • sensors.py - Sensors which provide observations to our agents during training. E.g. the RGBRearrangeSensor obtains RGB images from the environment and returns them for use by the agent.
    • tasks.py - Definitions of the UnshuffleTask, WalkthroughTask, and RearrangeTaskSampler classes. For more information on how these are used, see the Setting up Rearrangement section.
    • utils.py - Standalone utility functions (e.g. computing IoU between 3D bounding boxes).

You can then install requirements by running

pip install -r requirements.txt

or, if you prefer using conda, you can create a thor-rearrange environment with our requirements by running

export MY_ENV_NAME=thor-rearrange
export CONDA_BASE="$(dirname $(dirname "${CONDA_EXE}"))"
export PIP_SRC="${CONDA_BASE}/envs/${MY_ENV_NAME}/pipsrc"
conda env create --file environment.yml --name $MY_ENV_NAME
Why not just run conda env create --file environment.yml --name thor-rearrange by itself?

If you were to run conda env create --file environment.yml --name thor-rearrange nothing would break but we have some pip requirements in our environment.yml file and, by default, these are saved in a local ./src directory. By explicitly specifying the PIP_SRC variable we can have it place these pip-installed packages in a nicer (more hidden) location.

Python 3.6+ 🐍 . Each of the actions supports typing within Python.

AI2-THOR 2.7.2 🧞 . To ensure reproducible results, we're restricting all users to use the exact same version of AI2-THOR.

AllenAct. We use the AllenAct reinforcement learning framework for generating baseline models, baseline training pipelines, and for several of its helpful abstractions/utilities.

SciPy. We use SciPy for evaluation. It helps calculate the IoU between 3D bounding boxes.

Rearrangement Task Description

Object Rearrangement Example

Overview. Our rearrangement task involves moving and modifying (i.e. opening/closing) randomly placed objects within a room to obtain a goal configuration. There are 2 phases:

  1. Walkthrough. The agent walks around the room and observes the objects in their ideal goal state.
  2. Unshuffle. After the walkthrough phase, we randomly change between 1 and 5 objects in the room. The agent's goal is to identify which objects have changed and reset those objects to their state from the walkthrough phase. Changes to an object's state may include changes to its position, orientation, or openness.

Challenge Tracks and Datasets

☝️ + ✌️ The 1- and 2-Phase Tracks

For this 2021 challenge we have two distinct tracks:

  • 1-Phase Track (Easier). In this track we merge both of the above phases into a single phase. At every step the agent obtains observations from the walkthrough (goal) state as well as the shuffled state. This allows the agent to directly compare aligned images from the two world-states and thus makes it much easier to determine if an object is, or is not, in its goal pose.
  • 2-Phase Track (Harder). In this track, the walkthrough and unshuffle phases occur sequentially and so, once in the unshuffle phase, the agent no longer has any access to the walkthrough state except through any memory it has saved.

Datasets

For this challenge we have four distinct dataset splits: "train", "train_unseen", "val", and "test". The "train" and "train_unseen" splits use floor plans 1-20, 200-220, 300-320, and 400-420 within AI2-THOR, the "val" split uses floor plans 21-25, 221-225, 321-325, and 421-425, and the "test" split uses floor plans 26-30, 226-230, 326-330, and 426-430. These dataset splits are stored as the compressed pickle-serialized files data/*.pkl.gz. While you are free (and encouraged) to enhance the training set as you see fit, you should never train your agent within any of the test scenes.

For evaluation, your model will need to be evaluated on each of the above splits and the results submitted to our leaderboard link (see section below). As the "train" and "train_unseen" sets are quite large, we do not expect you to evaluate on their entirety. Instead we select ~1000 datapoints from each of these sets for use in evaluation. For convenience, we provide the data/combined.pkl.gz file which contains the "train", "train_unseen", "val", and "test" datapoints that should be used for evaluation.

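| Split        | # Total Episodes | # Episodes for Eval | Path                     |
| ------------ | ---------------- | ------------------- | ------------------------ |
| train        | 4000             | 1200                | data/train.pkl.gz        |
| train_unseen | 3800             | 1140                | data/train_unseen.pkl.gz |
| val          | 1000             | 1000                | data/val.pkl.gz          |
| test         | 1000             | 1000                | data/test.pkl.gz         |
| combined     | 4340             | 4340                | data/combined.pkl.gz     |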
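
As a minimal sketch of reading one of these files, the snippet below assumes each .pkl.gz file is an ordinary gzip stream wrapping a single pickled Python object, as the suffix suggests. The helper name load_split is hypothetical, and the structure of the loaded object is not described in this section, so inspect it before relying on any particular layout. Only unpickle files you trust.

```python
import gzip
import pickle
from typing import Any


def load_split(path: str) -> Any:
    """Load one dataset split from a gzip-compressed pickle file.

    Assumption: the file is a plain gzip stream containing one pickled object.
    """
    with gzip.open(path, "rb") as f:
        return pickle.load(f)


# Example usage (run from the repository root; path taken from the table above):
#   data = load_split("data/val.pkl.gz")
#   print(type(data), len(data))
```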

Submitting to the Leaderboard

We are tracking challenge participant entries using the AI2 Leaderboard. The team with the best submission made to either of the below leaderboards by May 31st (midnight, anywhere on earth) will be announced at the CVPR'21 Embodied-AI Workshop and invited to produce a video describing their approach.

Submissions can be made to the 1-phase leaderboard here and submissions to the 2-phase leaderboard can be made here.

Submissions should include your agent's trajectories for all tasks contained within the combined.pkl.gz dataset; this "combined" dataset includes tasks for the train, train_unseen, validation, and test sets. For an example of how to iterate through all the datapoints in this dataset and save the resulting metrics in our expected submission format, see here.

A (full) example of the expected submission format for the 1-phase task can be found here and, for the 2-phase task, can be found here. Note that this submission format is a gzip'ed json file where the json file has the form

{
  "UNIQUE_ID_OF_TASK_0": YOUR_AGENTS_METRICS_AND_TRAJECTORY_FOR_TASK_0,
  "UNIQUE_ID_OF_TASK_1": YOUR_AGENTS_METRICS_AND_TRAJECTORY_FOR_TASK_1,
  ...
}

These metrics and unique IDs can be easily obtained when iterating over the dataset (see the above example).
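
The file format described above (a gzip'ed json mapping of task id to metrics) can be produced with the standard library alone. The following is a sketch, not the repository's own script: the function names write_submission and read_submission are hypothetical, and metrics_by_task stands for whatever dictionary you build while iterating over the dataset.

```python
import gzip
import json
from typing import Any, Dict


def write_submission(metrics_by_task: Dict[str, Any],
                     path: str = "submission.json.gz") -> None:
    """Write {unique_task_id: metrics_and_trajectory} as a gzip'ed json file."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump(metrics_by_task, f)


def read_submission(path: str) -> Dict[str, Any]:
    """Read a submission back, e.g. to sanity-check it before uploading."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)
```

Reading the file back and checking that the number of keys matches the number of evaluated tasks is a cheap check before uploading.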

Alternatively: if you run inference on the combined dataset using AllenAct (see below for more details) then you can simply (1) gzip the metrics*.json file saved when running inference, (2) rename this file submission.json.gz, and (3) submit this file to the leaderboard directly.

Allowed Observations

In both of these tracks, agents should make decisions based off of egocentric sensor readings. The types of sensors allowed/provided for this challenge include:

POV Agent Image Depth Agent Image

  1. RGB images - having shape 224x224x3 and an FOV of 90 degrees.
  2. Depth maps - having shape 224x224 and an FOV of 90 degrees.
  3. Perfect egomotion - We allow for agents to know precisely how far (and in which direction) they have moved as well as how many degrees they have rotated.

While you are absolutely free to use any sensor information you would like during training (e.g. pretraining your CNN using semantic segmentations from AI2-THOR or using a scene graph to compute expert actions for imitation learning) such additional sensor information should not be used at inference time.

Allowed Actions

A total of 82 actions are available to our agents. These include:

Navigation

  • Move[Ahead/Left/Right/Back] - Results in the agent moving 0.25m in the specified direction if doing so would not result in the agent colliding with something.

  • Rotate[Right/Left] - Results in the agent rotating 90 degrees clockwise (if Right) or counterclockwise (if Left). This action may fail if the agent is holding an object and rotating would cause the object to collide.

  • Look[Up/Down] - Results in the agent raising or lowering its camera angle by 30 degrees (up to a max of 60 degrees below horizontal and 30 degrees above horizontal).

Object Interaction

  • Pickup[OBJECT_TYPE] - Where OBJECT_TYPE is one of the 62 pickupable object types, see constants.py. This action results in the agent picking up a visible object of type OBJECT_TYPE if: (a) the agent is not already holding an object, (b) the agent is close enough to the object (within 1.5m), and (c) picking up the object would not result in it colliding with objects in front of the agent. If there are multiple objects of type OBJECT_TYPE then the object closest to the agent is chosen.

  • Open[OBJECT_TYPE] - Where OBJECT_TYPE is one of the 10 openable object types that are not also pickupable, see constants.py. If an object whose openness is different from the openness in the goal state is visible and within 1.5m of the agent, this object's openness is changed to its value in the goal state.

  • PlaceObject - Results in the agent dropping its held object. If the held object's goal state is visible and within 1.5m of the agent, it is placed into that goal state. Otherwise, a heuristic is used to place the object on a nearby surface.

Done action

  • Done - Results in the walkthrough or unshuffle phase immediately terminating.
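
The total of 82 follows directly from the per-family counts given in the list above. The tally below is only an arithmetic check; the dictionary name and keys are illustrative, and the authoritative action and object-type lists live in constants.py.

```python
# Per-family action counts, taken from the descriptions above.
ACTION_COUNTS = {
    "Move[Ahead/Left/Right/Back]": 4,
    "Rotate[Right/Left]": 2,
    "Look[Up/Down]": 2,
    "Pickup[OBJECT_TYPE]": 62,  # one action per pickupable object type
    "Open[OBJECT_TYPE]": 10,    # one action per openable, non-pickupable type
    "PlaceObject": 1,
    "Done": 1,
}

TOTAL_ACTIONS = sum(ACTION_COUNTS.values())
print(TOTAL_ACTIONS)  # 82
```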

🍽️ Setting up Rearrangement

✨ Learning by example

See the example.py file for an example of how you can instantiate the 1- and 2-phase variants of our rearrangement task.

🌎 The Rearrange THOR Environment class

The rearrange.environment.RearrangeTHOREnvironment class provides a wrapper around the AI2-THOR environment and is designed to

  1. Make it easy to set up an AI2-THOR scene in a particular state ready for rearrangement.
  2. Provide utilities to make it easy to evaluate (see e.g. the poses and compare_poses methods) how close the current state of the environment is to the goal state.
  3. Provide an API with which the agent may interact with the environment.

πŸ’ The Rearrange Task Sampler class

You'll notice that the above RearrangeTHOREnvironment is not explicitly instantiated by the example.py script and, instead, we create rearrange.tasks.RearrangeTaskSampler objects using the TwoPhaseRGBBaseExperimentConfig.make_sampler_fn and OnePhaseRGBBaseExperimentConfig.make_sampler_fn. This is because the RearrangeTHOREnvironment is very flexible and doesn't know anything about training/validation/test datasets, the types of actions we want our agent to be restricted to use, or precisely which types of sensor observations we want to give our agents (e.g. RGB images, depth maps, etc). All of these extra details are managed by the RearrangeTaskSampler which iteratively creates new tasks for our agent to complete when calling the next_task method. During training, these new tasks can be sampled indefinitely while, during validation or testing, the tasks will only be sampled until the validation/test datasets are exhausted. This sampling is best understood by example so please go over the example.py file.

The Walkthrough Task and Unshuffle Task classes

As described above, the RearrangeTaskSampler samples tasks for our agent to complete; these tasks correspond to instantiations of the rearrange.tasks.WalkthroughTask and rearrange.tasks.UnshuffleTask classes. For the 2-phase challenge track, the RearrangeTaskSampler will first sample a new WalkthroughTask after which it will sample a corresponding UnshuffleTask where the agent must return the objects to their poses at the start of the WalkthroughTask.

Object Poses

Accessing object poses 🧘 . The poses of all objects in the environment can be accessed using the RearrangeTHOREnvironment.poses property, i.e.

unshuffle_start_poses, walkthrough_start_poses, current_poses = env.poses  # where env is a RearrangeTHOREnvironment instance

Reading an object's pose. Here, unshuffle_start_poses, walkthrough_start_poses, and current_poses evaluate to a list of dictionaries and are defined as:

  • unshuffle_start_poses stores a list of object poses if the agent were to do nothing to the env during the unshuffling phase.
  • walkthrough_start_poses stores a list of object poses that the agent sees during the walkthrough phase.
  • current_poses stores a list of object poses in the current state of the environment (i.e. possibly after the unshuffle agent makes all its changes to the env during the unshuffling phase).

Each dictionary contains the object's pose in a form similar to:

{
    "type": "Candle",
    "position": {
        "x": -0.3012670874595642,
        "y": 0.7431036233901978,
        "z": -2.040205240249634
    },
    "rotation": {
        "x": 2.958569288253784,
        "y": 0.027708930894732475,
        "z": 0.6745457053184509
    },
    "openness": None,
    "pickupable": True,
    "broken": False,
    "objectId": "Candle|-00.30|+00.74|-02.04",
    "name": "Candle_977f7f43",
    "parentReceptacles": [
        "Bathtub|-01.28|+00.28|-02.53"
    ],
    "bounding_box": [
        [-0.27043721079826355, 0.6975823640823364, -2.0129783153533936],
        [-0.3310248851776123, 0.696869969367981, -2.012985944747925],
        [-0.3310534358024597, 0.6999208927154541, -2.072017192840576],
        [-0.27046576142311096, 0.7006332278251648, -2.072009563446045],
        [-0.272365003824234, 0.8614493608474731, -2.0045082569122314],
        [-0.3329526484012604, 0.8607369661331177, -2.0045158863067627],
        [-0.3329811990261078, 0.8637878894805908, -2.063547134399414],
        [-0.27239352464675903, 0.8645002245903015, -2.063539505004883]
    ]
}

Matching objects across poses. Across unshuffle_start_poses, walkthrough_start_poses, and current_poses, the ith entry in each list will always correspond to the same object across each pose list. So, unshuffle_start_poses[5] will refer to the same object as walkthrough_start_poses[5] and current_poses[5]. Most scenes have around 70 objects, among which, 10 to 20 are pickupable by the agent.
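
Because the three lists are index-aligned, they can be compared entry by entry with zip. The sketch below lists the objects whose position differs between two pose lists. The helper names and the tol threshold are hypothetical, the second object name is made up for the demo, and this simple distance test is not the challenge's evaluation criterion (which is based on bounding-box IoU).

```python
import math
from typing import Any, Dict, List


def position_distance(pose_a: Dict[str, Any], pose_b: Dict[str, Any]) -> float:
    """Euclidean distance between the "position" entries of two poses."""
    a, b = pose_a["position"], pose_b["position"]
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in ("x", "y", "z")))


def moved_object_names(goal_poses: List[Dict[str, Any]],
                       current_poses: List[Dict[str, Any]],
                       tol: float = 0.01) -> List[str]:
    """Names of objects whose position differs by more than tol (hypothetical)."""
    assert len(goal_poses) == len(current_poses)  # lists are index-aligned
    return [
        goal["name"]
        for goal, current in zip(goal_poses, current_poses)
        if position_distance(goal, current) > tol
    ]


# Toy poses containing only the keys this sketch needs.
goal_poses = [
    {"name": "Candle_977f7f43", "position": {"x": 0.0, "y": 0.7, "z": -2.0}},
    {"name": "Mug_example", "position": {"x": 1.0, "y": 0.9, "z": 0.5}},
]
current_poses = [
    {"name": "Candle_977f7f43", "position": {"x": 0.5, "y": 0.7, "z": -2.0}},
    {"name": "Mug_example", "position": {"x": 1.0, "y": 0.9, "z": 0.5}},
]
print(moved_object_names(goal_poses, current_poses))  # ['Candle_977f7f43']
```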

Pose keys.

  • openness specifies, as a fraction in [0, 1], how far an object is opened. For objects where an openness value does not apply (e.g., Bowl, Spoon), the openness value is None.
  • bounding_box is only given for moveable objects, where the set of moveable objects may include objects such as couches or chairs that are not necessarily pickupable. For pickupable objects, the bounding_box is aligned to the object's relative axes. For moveable objects that are non-pickupable, the bounding_box is aligned to the global axes.
  • broken states whether the object broke from the agent's actions during the unshuffling phase. The initial pose or goal pose for each object will never be broken. But, if the agent picks up an object and drops it on a hard surface, it's possible that the object will break.

Evaluation

To evaluate the quality of a rearrangement agent we compute several metrics measuring how well the agent has managed to move objects so that their final poses are (approximately) equal to their goal poses.

When are poses (approximately) equal?

Recall that we represent the pose of an object as a combination of its:

  1. Openness - A value in [0,1] which measures how far the object has been opened.
  2. Position, Rotation, and bounding box - The 3D position, rotation, and bounding box of each object.
  3. Broken - A boolean indicating if the object has been broken (all goal object poses are unbroken).

In short: for objects that can open, the openness in the goal state and in the predicted state must differ by less than 20 percent; for objects that can move, the 3D bounding boxes of the goal pose and the predicted pose must have an IoU over 0.5.

To measure if two object poses are approximately equal we use the following criteria. The poses are not approximately equal (❌) if any of the first three holds:

  1. ❌ Either object pose is broken.
  2. ❌ The object is openable but not pickupable (e.g. a cabinet) and the openness values of the two poses differ by more than 0.2.
  3. ❌ The object is pickupable and the 3D bounding boxes of the two poses have an IoU under 0.5.
  4. ✔️ None of the above criteria are met, so the poses are not broken, are close in openness values, and have sufficiently high IoU.
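
The criteria above can be written as a short predicate. This is a sketch rather than the repository's implementation: the function name is hypothetical, the pose dictionaries are assumed to carry the "broken", "openness", and "pickupable" keys shown earlier, and the bounding-box IoU is passed in precomputed because computing it for object-aligned boxes is beyond the scope of this example.

```python
from typing import Any, Dict, Optional

OPENNESS_THRESHOLD = 0.2  # maximum allowed openness difference (criterion 2)
IOU_THRESHOLD = 0.5       # minimum required bounding-box IoU (criterion 3)


def poses_approx_equal(goal: Dict[str, Any],
                       current: Dict[str, Any],
                       iou: Optional[float]) -> bool:
    """Return True if none of the failure criteria apply.

    iou is the precomputed IoU between the two 3D bounding boxes; it is only
    consulted for pickupable objects and may be None otherwise.
    """
    # Criterion 1: a broken pose never matches.
    if goal["broken"] or current["broken"]:
        return False
    # Criterion 2: openable but not pickupable objects are compared by openness.
    if goal["openness"] is not None and not goal["pickupable"]:
        if abs(goal["openness"] - current["openness"]) > OPENNESS_THRESHOLD:
            return False
    # Criterion 3: pickupable objects are compared by bounding-box IoU.
    if goal["pickupable"]:
        if iou is None or iou < IOU_THRESHOLD:
            return False
    # Criterion 4: nothing failed.
    return True
```

For example, a cabinet whose goal openness is 0.0 and whose current openness is 0.1 passes, while a pickupable candle whose boxes overlap with an IoU of 0.3 does not.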

Computing metrics

Suppose that task is an instance of an UnshuffleTask in which your agent has taken actions until reaching a terminal state (e.g. either the agent has taken the maximum number of steps or it has taken the "done" action). Then metrics regarding the agent's performance can be computed by calling the task.metrics() function. This will return a dictionary of the form

{
    "task_info": {
        "scene": "FloorPlan420",
        "index": 7,
        "stage": "train"
    },
    "ep_length": 176,
    "unshuffle/ep_length": 7,
    "unshuffle/reward": 0.5058389582634852,
    "unshuffle/start_energy": 0.5058389582634852,
    "unshuffle/end_energy": 0.0,
    "unshuffle/prop_fixed": 1.0,
    "unshuffle/prop_fixed_strict": 1.0,
    "unshuffle/num_misplaced": 0,
    "unshuffle/num_newly_misplaced": 0,
    "unshuffle/num_initially_misplaced": 1,
    "unshuffle/num_fixed": 1,
    "unshuffle/num_broken": 0,
    "unshuffle/change_energy": 0.5058464936498058,
    "unshuffle/num_changed": 1,
    "unshuffle/prop_misplaced": 0.0,
    "unshuffle/energy_prop": 0.0,
    "unshuffle/success": 0.0,
    "walkthrough/ep_length": 169,
    "walkthrough/reward": 1.82,
    "walkthrough/num_explored_xz": 17,
    "walkthrough/num_explored_xzr": 46,
    "walkthrough/prop_visited_xz": 0.5151515151515151,
    "walkthrough/prop_visited_xzr": 0.3484848484848485,
    "walkthrough/num_obj_seen": 11,
    "walkthrough/prop_obj_seen": 0.9166666666666666
}

Of the above metrics, the most important (those used for comparing models) are

  • Success rate ("unshuffle/success") - This is the most unforgiving of our metrics and equals 1 if all object poses are in their goal states after the unshuffle phase.
  • % Misplaced ("unshuffle/prop_misplaced") - The above success metric is quite strict, requiring exact rearrangement of all objects, and also does not additionally penalize an agent for moving objects that should not be moved. This metric equals the number of misplaced objects at the end of the episode divided by the number of misplaced objects at the start of the episode. Note that this metric can be larger than 1 if the agent, during the unshuffle stage, misplaces more objects than were originally misplaced at the start.
  • % Fixed Strict ("unshuffle/prop_fixed_strict") - This metric equals 0 if, at the end of the unshuffle task, the agent has misplaced any new objects (i.e. it has incorrectly moved an object that started in its correct position). Otherwise, if it has not misplaced new objects, then this equals (# objects which started in the wrong pose but are now in the correct pose) / (# objects which started in an incorrect pose), i.e. the proportion of objects whose pose was fixed.
  • % Energy Remaining ("unshuffle/energy_prop") - The above metrics do not give any partial credit if, for example, the agent moves an object across a room and towards its goal pose but fails to place it so that it has a sufficiently high IoU with the goal. To allow for partial credit, we define an energy function D that monotonically decreases to 0 as two poses get closer together (see code for full details) and which equals zero if two poses are approximately equal. This metric is then defined as the amount of energy remaining at the end of the unshuffle episode divided by the total energy at the start of the unshuffle episode, i.e. equals (sum of energy between all goal/current object poses at end of the unshuffle phase) / (sum of energy between all goal/current object poses at the start of the unshuffle phase).
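
To make the first three definitions concrete, the sketch below computes them from two boolean lists that record, per object, whether it is approximately in its goal pose at the start and at the end of the unshuffle phase. The function name and input representation are hypothetical; the values reported by task.metrics() remain the authoritative ones, and the energy-based metric is omitted because the energy function D is not specified here.

```python
from typing import Dict, List


def unshuffle_summary(start_ok: List[bool], end_ok: List[bool]) -> Dict[str, float]:
    """start_ok[i] / end_ok[i]: is object i approximately in its goal pose
    at the start / end of the unshuffle phase?"""
    assert len(start_ok) == len(end_ok)
    num_initially_misplaced = sum(not s for s in start_ok)
    num_misplaced = sum(not e for e in end_ok)
    num_fixed = sum((not s) and e for s, e in zip(start_ok, end_ok))
    num_newly_misplaced = sum(s and (not e) for s, e in zip(start_ok, end_ok))

    # Assumption: at least one object starts misplaced (1 to 5 are shuffled);
    # the max() only guards against division by zero on degenerate input.
    denom = max(num_initially_misplaced, 1)
    prop_fixed_strict = 0.0 if num_newly_misplaced > 0 else num_fixed / denom
    return {
        "success": float(num_misplaced == 0),
        "prop_misplaced": num_misplaced / denom,
        "prop_fixed_strict": prop_fixed_strict,
    }


# Three objects: one never misplaced, one fixed, one still misplaced.
print(unshuffle_summary([True, False, False], [True, True, False]))
# {'success': 0.0, 'prop_misplaced': 0.5, 'prop_fixed_strict': 0.5}
```

Note how knocking a correctly placed object out of position would drive prop_fixed_strict to 0 and could push prop_misplaced above 1.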

Training Baseline Models with AllenAct

We use the AllenAct framework for training our baseline rearrangement models; this framework is automatically installed when installing the requirements for this project.

Before running training or inference you'll first have to add the ai2thor-rearrangement directory to your PYTHONPATH (so that python and AllenAct know where to look for various modules). To do this you can run the following:

cd YOUR/PATH/TO/ai2thor-rearrangement
export PYTHONPATH=$PYTHONPATH:$PWD

Let's say you want to train a model for the 1-phase challenge. This can be easily done by running the command

allenact -o rearrange_out -b . baseline_configs/one_phase/one_phase_rgb_resnet_dagger.py 

This will train (using DAgger, a form of imitation learning) a model which uses a pretrained (with frozen weights) ResNet18 as the visual backbone that feeds into a recurrent neural network (a GRU) before producing action probabilities and a value estimate. Results from this training are then saved to rearrange_out where you can find model checkpoints, tensorboard plots, and configuration files that can be used if you, in the future, forget precisely what the details of your experiment were.

A similar model can be trained for the 2-phase challenge by running

allenact -o rearrange_out -b . baseline_configs/two_phase/two_phase_rgb_resnet_ppowalkthrough_ilunshuffle.py

Pretrained Models

We currently provide the following pretrained models (see our paper for details on these models):

| Model                       | % Fixed Strict (Test) | Pretrained Model |
| --------------------------- | --------------------- | ---------------- |
| 1-Phase ResNet18+ANM IL     | 8.9%                  | (link)           |
| 1-Phase ResNet18 IL         | 6.3%                  | (link)           |
| 1-Phase ResNet18 PPO        | 5.3%                  | (link)           |
| 1-Phase Simple IL           | 4.8%                  | (link)           |
| 1-Phase Simple PPO          | 4.6%                  | (link)           |
| 2-Phase ResNet18+ANM IL+PPO | 1.44%                 | (link)           |
| 2-Phase ResNet18 IL+PPO     | 0.66%                 | (link)           |

These models can be downloaded from the above links and should be placed into the pretrained_model_ckpts directory. You can then, for example, run inference for the 1-Phase ResNet18 IL model using AllenAct by running:

export CURRENT_TIME=$(date '+%Y-%m-%d_%H-%M-%S') # This is just to record when you ran this inference
allenact baseline_configs/one_phase/one_phase_rgb_resnet_dagger.py \
-c pretrained_model_ckpts/exp_OnePhaseRGBResNetDagger_40proc__stage_00__steps_000050058550.pt \
--extra_tag $CURRENT_TIME \
--eval

This will evaluate the model across all datapoints in the data/combined.pkl.gz dataset, which contains data from the train, train_unseen, val, and test sets so that evaluation doesn't have to be run on each set separately.

Citation

If you use this work, please cite our paper (to appear in CVPR'21):

@InProceedings{RoomR,
  author = {Luca Weihs and Matt Deitke and Aniruddha Kembhavi and Roozbeh Mottaghi},
  title = {Visual Room Rearrangement},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2021}
}
Comments
  • Inference using AllenAct gets stuck at some point

    Executing the "Pretrained Phase-1" segment of the README gets stuck at some point.

    This is the exact command. allenact baseline_configs/one_phase/one_phase_rgb_resnet_dagger.py -c pretrained_model_ckpts/exp_OnePhaseRGBResNetDagger_40proc__time_2021-02-07_11-25-27__stage_00__steps_000075001830.pt -t 2021-02-07_11-25-27

    I have attached the screenshot of the stopping point. capture1

    I am using Ubuntu virtual machine (VMware). It isn't connected to the GPU, so maybe that is the issue? Since ai2thor needs to start the gui for the visualization of the scenes?

    Is it possible to use the environment and train the models without the gui? If that is the case then maybe it would be possible to run scripts by using Windows Subsystem for Linux? (wsl2)

    opened by senadkurtisi 15
  • Visulization and environments info

    Thanks for your awesome work.

    I am really interested in this work and I am wondering whether you provide a visulization of the third perspective of the agent?

    I am trying to gather some data to train my pretrained model. Could you provide an example or instructions about gathering the groundtruth of the object during the interaction with the environment, like the class and the position(6D pose in global coordinate or relative coordinate) of the object?

    opened by phj128 13
  • xdpyinfo:  unable to open display

    xdpyinfo: unable to open display ":0.1".

    Hi,

    I am facing an issue while trying to run the baseline models allenact -o rearrange_out -b . baseline_configs/one_phase/one_phase_rgb_resnet_dagger.py.

    xdpyinfo:  unable to open display ":0.1".
    Process ForkServerProcess-2:1:
    Traceback (most recent call last):
    .
    .
    .
    AssertionError: Invalid DISPLAY :0.1 - cannot find X server with xdpyinfo
    04/01 17:30:22 ERROR: Encountered Exception. Terminating train worker 1	[engine.py: 1319]
    

    Any suggestions to solve this? I can run python example.py successfully though.

    opened by nnsriram97 11
  • ValueError: Invalid commit id: xxxxxxx  - no build exists for arch=Linux

    Hello, I am new to the embodied AI area and when I tried to run the baseline model training. Following errors occurred and I really do not know what has happened. Is there anybody that could provide any clues on what may cause this. I would be very appreciated for this!

    [08/05 14:40:07 INFO:] Starting 19-th VectorSampledTask worker with args [{'force_cache_reset': False, 'epochs': inf, 'stage': 'train', 'allowed_scenes': ['FloorPlan419', 'FloorPlan420'], 'scene_to_allowed_rearrange_inds': None, 'seed': 151437334310827783848716556864843341311, 'x_display': '0.0', 'sensors': [<rearrange.sensors.RGBRearrangeSensor object at 0x7f67ac88fdd8>, <rearrange.sensors.UnshuffledRGBRearrangeSensor object at 0x7f67ac88ff98>, <allenact.base_abstractions.sensor.ExpertActionSensor object at 0x7f67ac8a11d0>], 'mp_ctx': <multiprocessing.context.ForkServerContext object at 0x7f67b5608c50>}]        [vector_sampled_tasks.py: 380]
    Process ForkServerProcess-1:19:
    Traceback (most recent call last):
      File "/root/anaconda3/envs/ai2thor-rearrange/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
        self.run()
      File "/root/anaconda3/envs/ai2thor-rearrange/lib/python3.6/multiprocessing/process.py", line 93, in run
        self._target(*self._args, **self._kwargs)
      File "/root/anaconda3/envs/ai2thor-rearrange/lib/python3.6/site-packages/allenact/algorithms/onpolicy_sync/vector_sampled_tasks.py", line 282, in _task_sampling_loop_worker
        should_log=should_log,
      File "/root/anaconda3/envs/ai2thor-rearrange/lib/python3.6/site-packages/allenact/algorithms/onpolicy_sync/vector_sampled_tasks.py", line 821, in __init__
        sampler_fn_args=[{"mp_ctx": None, **args} for args in sampler_fn_args_list],
      File "/root/anaconda3/envs/ai2thor-rearrange/lib/python3.6/site-packages/allenact/algorithms/onpolicy_sync/vector_sampled_tasks.py", line 1026, in _create_generators
        if next(generators[-1]) != "started":
      File "/root/anaconda3/envs/ai2thor-rearrange/lib/python3.6/site-packages/allenact/algorithms/onpolicy_sync/vector_sampled_tasks.py", line 881, in _task_sampling_loop_generator_fn
        task_sampler = make_sampler_fn(**sampler_fn_args)
      File "/home/lishuzhao/ai2thor-rearrangement/baseline_configs/one_phase/one_phase_rgb_base.py", line 81, in make_sampler_fn
        **kwargs,
      File "/home/lishuzhao/ai2thor-rearrangement/rearrange/tasks.py", line 877, in from_fixed_dataset
        **init_kwargs,
      File "/home/lishuzhao/ai2thor-rearrangement/rearrange/tasks.py", line 828, in __init__
        self.walkthrough_env = RearrangeTHOREnvironment(**rearrange_env_kwargs)
      File "/home/lishuzhao/ai2thor-rearrangement/rearrange/environment.py", line 245, in __init__
        self.controller = self.create_controller()
      File "/home/lishuzhao/ai2thor-rearrangement/rearrange/environment.py", line 264, in create_controller
        **self._controller_kwargs,
      File "/root/anaconda3/envs/ai2thor-rearrange/lib/python3.6/site-packages/ai2thor/controller.py", line 465, in __init__
        self._build = self.find_build(local_build, commit_id, branch)
      File "/root/anaconda3/envs/ai2thor-rearrange/lib/python3.6/site-packages/ai2thor/controller.py", line 1118, in find_build
        % (commit_id, platform.system())
    ValueError: Invalid commit_id: f46d5ec42b65fdae9d9a48db2b4fb6d25afbd1fe - no build exists for arch=Linux
    Process ForkServerProcess-2:19:
    Traceback (most recent call last):
      File "/root/anaconda3/envs/ai2thor-rearrange/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
        self.run()
      File "/root/anaconda3/envs/ai2thor-rearrange/lib/python3.6/multiprocessing/process.py", line 93, in run
        self._target(*self._args, **self._kwargs)
      File "/root/anaconda3/envs/ai2thor-rearrange/lib/python3.6/site-packages/allenact/algorithms/onpolicy_sync/vector_sampled_tasks.py", line 282, in _task_sampling_loop_worker
        should_log=should_log,
      File "/root/anaconda3/envs/ai2thor-rearrange/lib/python3.6/site-packages/allenact/algorithms/onpolicy_sync/vector_sampled_tasks.py", line 821, in __init__
        sampler_fn_args=[{"mp_ctx": None, **args} for args in sampler_fn_args_list],
      File "/root/anaconda3/envs/ai2thor-rearrange/lib/python3.6/site-packages/allenact/algorithms/onpolicy_sync/vector_sampled_tasks.py", line 1026, in _create_generators
        if next(generators[-1]) != "started":
      File "/root/anaconda3/envs/ai2thor-rearrange/lib/python3.6/site-packages/allenact/algorithms/onpolicy_sync/vector_sampled_tasks.py", line 881, in _task_sampling_loop_generator_fn
        task_sampler = make_sampler_fn(**sampler_fn_args)
      File "/home/lishuzhao/ai2thor-rearrangement/baseline_configs/one_phase/one_phase_rgb_base.py", line 81, in make_sampler_fn
        **kwargs,
      File "/home/lishuzhao/ai2thor-rearrangement/rearrange/tasks.py", line 877, in from_fixed_dataset
        **init_kwargs,
      File "/home/lishuzhao/ai2thor-rearrangement/rearrange/tasks.py", line 828, in __init__
        self.walkthrough_env = RearrangeTHOREnvironment(**rearrange_env_kwargs)
      File "/home/lishuzhao/ai2thor-rearrangement/rearrange/environment.py", line 245, in __init__
        self.controller = self.create_controller()
      File "/home/lishuzhao/ai2thor-rearrangement/rearrange/environment.py", line 264, in create_controller
        **self._controller_kwargs,
      File "/root/anaconda3/envs/ai2thor-rearrange/lib/python3.6/site-packages/ai2thor/controller.py", line 465, in __init__
        self._build = self.find_build(local_build, commit_id, branch)
      File "/root/anaconda3/envs/ai2thor-rearrange/lib/python3.6/site-packages/ai2thor/controller.py", line 1118, in find_build
        % (commit_id, platform.system())
    ValueError: Invalid commit_id: f46d5ec42b65fdae9d9a48db2b4fb6d25afbd1fe - no build exists for arch=Linux
    
    opened by Leeeshuz 10
  • Cloud Rendering Support

    Hello ai2thor-rearrangement authors,

    I am using AI2-THOR in a setting where I cannot use an X server for rendering. It looks like support for Vulkan rendering was added to AI2-THOR in a recent release (3.5.0, I think) via the CloudRendering option in the Controller object. However, it looks like only a specific commit ID for the Unity engine currently supports Vulkan with AI2-THOR.

    https://github.com/allenai/ai2thor/blob/db856525b770e0ff5af38e9efa27ac0073221be3/ai2thor/build.py#L35

    But, the commit ID is set to something different in the rearrangement challenge:

    https://github.com/allenai/ai2thor-rearrangement/blob/main/rearrange/constants.py#L12

    I find that if I set the commit ID to the one that supports Vulkan rendering, I see the following error when attempting to rotate the agent. If I understand the error, it looks like this version of Unity doesn't support the actionSimulationSeconds argument, which is used by the RearrangeTHOREnvironment class.

    File ".../allenact/allenact/base_abstractions/task.py", line 124, in step
        sr = self._step(action=action)
      File ".../ai2thor-rearrangement/rearrange/tasks.py", line 636, in _step
        action_success = getattr(self.walkthrough_env, action_name)()
      File ".../ai2thor-rearrangement/rearrange/environment.py", line 581, in rotate_right
        return execute_action(
      File ".../ai2thor-rearrangement/rearrange/utils.py", line 250, in execute_action
        event = controller.step(thor_action, **kwargs)
      File ".../ai2thor/controller.py", line 960, in step
        raise ValueError(self.last_event.metadata["errorMessage"])
    ValueError: 
            Action: "RotateRight" called with invalid argument: 'actionSimulationSeconds'
            Expected arguments: Nullable`1 degrees = , Boolean manualInteract = False, Boolean forceAction = False, Single speed = 1, Boolean waitForFixedUpdate = False, Boolean returnToStart = True, Boolean disableRendering = True, Single fixedDeltaTime = 0.02
            Your arguments: 'actionSimulationSeconds', 'fixedDeltaTime'
            Valid ways to call "RotateRight" action:
                    Void RotateRight(Nullable`1 degrees = , Boolean manualInteract = False, Boolean forceAction = False, Single speed = 1, Boolean waitForFixedUpdate = False, Boolean returnToStart = True, Boolean disableRendering = True, Single fixedDeltaTime = 0.02)
    

    If I comment out all instances of the actionSimulationSeconds, the issue goes away and the environment runs without error. However, I'm not sure what effect its removal will have on subsequent agent evaluations I will perform.
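The workaround described above (dropping the argument the build rejects before the action is sent) could be sketched as follows. This is only an illustration, not code from this repository or from AI2-THOR: `drop_unsupported_kwargs` is a hypothetical helper, and the argument values are placeholders. Whether removing `actionSimulationSeconds` changes physics behaviour or evaluation results is not something this sketch answers.

```python
from typing import Any, Dict, Iterable


def drop_unsupported_kwargs(
    kwargs: Dict[str, Any], unsupported: Iterable[str]
) -> Dict[str, Any]:
    """Return a copy of `kwargs` without the keys listed in `unsupported`.

    Hypothetical helper: not part of ai2thor or ai2thor-rearrangement.
    """
    blocked = set(unsupported)
    return {key: value for key, value in kwargs.items() if key not in blocked}


# Placeholder values; only the key names come from the error message above.
step_kwargs = {
    "degrees": 90,
    "actionSimulationSeconds": 1.5,
    "fixedDeltaTime": 0.02,
}

# Filter once, then pass `filtered` wherever the kwargs are forwarded
# to the controller (the original dict is left untouched).
filtered = drop_unsupported_kwargs(step_kwargs, ["actionSimulationSeconds"])
print(filtered)
```

Filtering in one place, rather than commenting out every occurrence, keeps the change easy to revert when switching back to a build that accepts the argument.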

    Is this a reasonable change for me to make in order to support Vulkan?

    Thanks! Brandon

    opened by brandontrabucco 6
  • When I ran the Room Rearrangement task experiment, EOFerror appeared

    When I run the following command, the error in the screenshot appears:

    allenact -o rearrange_out -b . baseline_configs/one_phase/one_phase_rgb_resnet_dagger.py

    The error occurred after the program had been running for some time. There is a GPU on my computer.

      • OS: Ubuntu 9.3.0-17ubuntu1~20.04
      • Allenact: 0.4.0
      • Allenact-plugins: 0.4.0
      • GPU: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti]

    opened by twb1235 4
  • ModuleNotFoundError: No module named 'baseline_configs'

    I'm trying to run training code for the provided baseline model. allenact -o rearrange_out -b . baseline_configs/one_phase/one_phase_rgb_resnet_dagger.py

    But I face this error. Any idea why this occurs even after providing allenact -b ./ (which contains baseline_configs)?

    03/23 21:25:23 ERROR: Uncaught exception:	[system.py: 140]
    Traceback (most recent call last):
    ...
      File "/path/to/anaconda3/envs/ai2thor/lib/python3.7/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
      File "<frozen importlib._bootstrap>", line 983, in _find_and_load
      File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
      File "<frozen importlib._bootstrap>", line 983, in _find_and_load
      File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
      File "<frozen importlib._bootstrap>", line 983, in _find_and_load
      File "<frozen importlib._bootstrap>", line 965, in _find_and_load_unlocked
    ModuleNotFoundError: No module named 'baseline_configs'
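One possible cause of an error like this is that the directory containing `baseline_configs` is not on Python's import path when the experiment module is loaded. The traceback does not confirm that this is what happened here, so treat the following as a diagnostic sketch only. `ensure_importable` is a hypothetical helper, not part of AllenAct or this repository; it uses only the standard library.

```python
import importlib.util
import sys
from pathlib import Path


def ensure_importable(package: str, root: str) -> bool:
    """Put `root` on sys.path if needed and report whether `package` is findable.

    Hypothetical diagnostic helper: not part of allenact or ai2thor-rearrangement.
    """
    root_path = str(Path(root).resolve())
    if root_path not in sys.path:
        sys.path.insert(0, root_path)
    # Make sure directories created after interpreter start-up are noticed.
    importlib.invalidate_caches()
    return importlib.util.find_spec(package) is not None


# Run from the repository root; "." is assumed to contain baseline_configs/.
found = ensure_importable("baseline_configs", ".")
print("baseline_configs importable:", found)
```

If this prints `False` from the directory where the training command is launched, the working directory or the path given to the base-directory option is worth checking first.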
    
    opened by nnsriram97 4
  • Could you open-source the data-generation code?

    Hi, I'm really interested in this work, and I want to generate my own rearrangement tasks for research. Currently it seems that rearrangement tasks can only be loaded from fixed datasets (e.g. train, test, val, combined). Could you open-source the data-generation code? I would really appreciate it.

    opened by AaronAnima 3
  • Agent not placing the objects in the correct pose

    Sometimes, even when the goal state is visible, the agent doesn't place the object in the right place, and the pose of the placed object differs from the goal pose. This can be seen in the first example of the expert from the valid dataset.

    Here's a video from the expert where it is trying to rearrange the toilet paper but is unsuccessful in placing it at the right place and hence keeps retrying.

    https://user-images.githubusercontent.com/11937559/120375753-00f56380-c2d0-11eb-89f1-b00167b23941.mp4

    opened by nnsriram97 3
  • Leaderboard Question

    It says on the Leaderboard website that "To protect against overfitting on the blind test dataset you can only publish once every 7 days." Does that mean we can only submit once, since there are fewer than 7 days left? I suppose there is no blind test dataset in the rearrangement task, so can we submit multiple times? Thanks!

    opened by lan-lyu 3
  • Is depth available?

    Hi, can we use a depth sensor for the rearrangement task? It wasn't mentioned in the paper, but I found a DepthRearrangeSensor class in sensors.py, so I am wondering whether this information is available.

    Thanks

    opened by zhangfuyang 3
  • rearrangement benchmark (2022) openness

    I am opening this to discuss a few items related to the most recent rearrangement benchmark (2022):

    1. We noticed that when executing the "open" command in the unshuffle phase on an object whose openness has not changed since the walkthrough phase, the action has no effect and the metrics (e.g. Fixed Strict, Success, etc.) do not account for the attempted change in state.
    2. Sometimes the openness of an object (particularly drawers) is not noticeably different between the walkthrough and unshuffle phases, but the object shows up as changed when accessing the object's metadata (openness level between the phases & start energy). I attached an image of a drawer in the walkthrough and unshuffle phases that was tagged as changed by the start energy.
    3. When executing pickup {object}, sometimes the wrong object instance is picked up if two objects of the same category are within view. For example, in the image below, the target is the toilet paper roll on the rack, but the roll on the sink is picked up instead. However, this occurs under rare circumstances.

    Thank you for all the great development tools!

    opened by Gabesarch 0
Releases(v0.5.1)
  • v0.5.1(Mar 25, 2022)

    This release adds:

    1. Links to the 2022 leaderboards.
    2. Pretrained model and experiment configuration file for the embodied CLIP model that is SOTA for the 1-phase 2021 leaderboard. Relevant paper: https://arxiv.org/abs/2111.09888 .
  • v0.5.0(Feb 15, 2022)

    🔥🆕🔥 2022 AI2THOR-Rearrangement Challenge

    Our 2022 AI2-THOR Rearrangement Challenge has several upgrades distinguishing it from the 2021 version:

    1. New AI2-THOR version. We've upgraded the version of AI2-THOR we're using from 2.1.0 to 4.1.0. This brings:
      • Performance improvements
      • The ability to use the recently announced headless rendering feature, which makes it much easier to run AI2-THOR on shared servers where you may not have the admin privileges to start an X-server.
    2. New dataset. We've released a new rearrangement dataset to match the new AI2-THOR version. This new dataset has a more uniform balance of easy/hard episodes.
    3. Misc. improvements. We've fixed a number of minor bugs and performance issues from the 2021 challenge, improving consistency.
  • v0.4.1(May 17, 2021)

    This patch updates the AI2-THOR commit version in order to improve the quality of the data returned by the semantic mapping sensors. This fixes a bug where semantic map sensors might return incorrect convex hulls for "Drawer" objects.

  • v0.4.0(Apr 26, 2021)

    This release includes three new experiments corresponding to the semantic mapping experiments from the paper.

  • v0.3.0(Apr 9, 2021)

    This release includes two central updates:

    1. New instructions and links to our online challenge leaderboard along with instructions describing how to create submissions.
    2. Use of a new AI2-THOR build that improves physics determinism (necessary for making leaderboard evaluation repeatable).
  • v0.2.0(Feb 17, 2021)

    This release brings this repository up to date for our 2021 challenge. This includes several substantial upgrades:

    1. Definitions of new challenge tracks.
    2. Integration of our rearrangement environment into AllenAct abstractions.
    3. New baseline models with associated pretrained models.
    4. Large amounts of documentation.
    5. New training, validation, and test sets.