Solving Knapsack Problem with Amazon SageMaker RL

Knapsack is a canonical operations research problem. Our objective is to maximize the value of the items in the bag, but we cannot put all the items in because the bag capacity is limited. The problem is hard because the items have different values and weights, and there are many combinations to consider.

In the classic version of the problem, we pick the items in one shot. In this baseline, we instead consider the items one at a time over a fixed time horizon. We need to either put the item in the bag or throw it away. If we put it in the bag, we get a reward equal to the value of the item. If we throw the item away, we get a fixed penalty. If the bag is too full to accommodate the item, we are forced to throw it away. In the next step, another item appears and we need to decide again whether to put it in the bag or throw it away. This process repeats for a fixed number of steps. Because we do not know the value and weight of the items that will come in the future, and the bag can only hold so many items, it is not obvious what the right thing to do is.

At each time step, our agent is aware of the following information:
- Weight capacity of the bag
- Volume capacity of the bag
- Sum of item weight in the bag
- Sum of item volume in the bag
- Sum of item value in the bag
- Current item weight
- Current item volume
- Current item value
- Time remaining

At each time step, our agent can take one of the following actions:
- Put the item in the bag
- Throw the item away

At each time step, our agent gets the following reward depending on its action:
- Item value if you put it in the bag and the bag does not overflow
- A penalty if you throw the item away or if the item does not fit in the bag

You can see the specifics in the KnapSackMediumEnv class in knapsack_env.py. There are a couple of other classes that provide an easier (KnapSackEnv) and a more difficult (KnapSackHardEnv) version of this problem.

Using Amazon SageMaker RL

Amazon SageMaker RL allows you to train your RL agents on cloud machines using Docker containers. You do not have to worry about setting up your machines with the RL toolkits and deep learning frameworks. You can easily switch between many different machines set up for you, including powerful GPU machines that give a big speedup. You can also choose to use multiple machines in a cluster to further speed up training, which is often necessary for production-level loads.
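
The observation, action, and reward rules of the online knapsack problem described in this post can be sketched as a small environment. This is only an illustration, not the actual KnapSackMediumEnv from knapsack_env.py: the class name OnlineKnapsackSketch, the capacities, the horizon, the penalty of -1.0, and the item ranges are all assumptions chosen for the example.

```python
import random


class OnlineKnapsackSketch:
    """Hypothetical stand-in for the online knapsack environment (not knapsack_env.py)."""

    PUT, THROW = 0, 1  # the two actions available at each time step

    def __init__(self, weight_capacity=50, volume_capacity=50,
                 horizon=20, penalty=-1.0, seed=None):
        # All defaults are assumed values for illustration.
        self.weight_capacity = weight_capacity
        self.volume_capacity = volume_capacity
        self.horizon = horizon
        self.penalty = penalty
        self.rng = random.Random(seed)

    def _draw_item(self):
        # Assumed item distribution: integer weight, volume, value in [1, 10].
        self.item = tuple(self.rng.randint(1, 10) for _ in range(3))

    def _observation(self):
        weight, volume, value = self.item
        # Same nine quantities the agent is said to observe at each step.
        return (self.weight_capacity, self.volume_capacity,
                self.bag_weight, self.bag_volume, self.bag_value,
                weight, volume, value, self.time_remaining)

    def reset(self):
        self.bag_weight = self.bag_volume = self.bag_value = 0
        self.time_remaining = self.horizon
        self._draw_item()
        return self._observation()

    def step(self, action):
        weight, volume, value = self.item
        fits = (self.bag_weight + weight <= self.weight_capacity
                and self.bag_volume + volume <= self.volume_capacity)
        if action == self.PUT and fits:
            # Item goes in the bag: reward equals the item value.
            self.bag_weight += weight
            self.bag_volume += volume
            self.bag_value += value
            reward = float(value)
        else:
            # Thrown away, or it did not fit and is thrown away anyway.
            reward = self.penalty
        self.time_remaining -= 1
        done = self.time_remaining == 0
        self._draw_item()  # the next item appears
        return self._observation(), reward, done


def run_episode(env, policy):
    """Roll out one episode and return the total reward."""
    obs, total, done = env.reset(), 0.0, False
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += reward
    return total


def always_put(obs):
    """Naive baseline policy: try to keep every item."""
    return OnlineKnapsackSketch.PUT


if __name__ == "__main__":
    print(run_episode(OnlineKnapsackSketch(seed=0), always_put))
```

A trained RL agent would replace always_put with a learned policy that sometimes throws away a low-value item to keep room for better items later.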