
Citation and metadata

Recommended citation

Wesselhöft M, Braun P, Kreutzfeldt J (2023). Comparing Continuous Single-Agent Reinforcement Learning Controls in a Simulated Logistic Environment using NVIDIA Omniverse. Logistics Journal : Proceedings, Vol. 2023. (urn:nbn:de:0009-14-58257)


EndNote

%0 Journal Article
%T Comparing Continuous Single-Agent Reinforcement Learning Controls in a Simulated Logistic Environment using NVIDIA Omniverse
%A Wesselhöft, Mike
%A Braun, Philipp
%A Kreutzfeldt, Jochen
%J Logistics Journal : Proceedings
%D 2023
%V 2023
%N 1
%@ 2192-9084
%F wesselhoeft2023
%X With the transition to Logistics 4.0, the increasing demand for autonomous mobile robots (AMR) in logistics has amplified the complexity of fleet control in dynamic environments. Reinforcement learning (RL), particularly decentralized RL algorithms, has emerged as a potential solution given its ability to learn in uncertain terrains. While discrete RL structures have shown merit, their adaptability in logistics remains questionable due to their inherent limitations. This paper presents a comparative analysis of continuous RL algorithms - Advantage Actor-Critic (A2C), Deep Deterministic Policy Gradient (DDPG), and Proximal Policy Optimization (PPO) - in the context of controlling a Turtlebot3 within a warehouse scenario. Our findings reveal A2C as the frontrunner in terms of success rate and training time, while DDPG excels at step minimization and PPO distinguishes itself primarily through its relatively short training duration. This study underscores the potential of continuous RL algorithms, especially A2C, in the future of AMR fleet management in logistics. Significant work remains to be done, particularly in the area of algorithmic fine-tuning.
%L 620
%K Autonome Roboter
%K Künstliche Intelligenz
%K Logistik 4.0
%K Reinforcement Learning
%K Robotik
%K artificial intelligence
%K autonomous mobile robots
%K logistics 4.0
%K robotics
%R 10.2195/lj_proc_wesselhoeft_en_202310_01
%U http://nbn-resolving.de/urn:nbn:de:0009-14-58257
%U http://dx.doi.org/10.2195/lj_proc_wesselhoeft_en_202310_01


BibTeX

@Article{wesselhoeft2023,
  author = 	"Wesselh{\"o}ft, Mike
		and Braun, Philipp
		and Kreutzfeldt, Jochen",
  title = 	"Comparing Continuous Single-Agent Reinforcement Learning Controls in a Simulated Logistic Environment using NVIDIA Omniverse",
  journal = 	"Logistics Journal : Proceedings",
  year = 	"2023",
  volume = 	"2023",
  number = 	"1",
  keywords = 	"Autonome Roboter; K{\"u}nstliche Intelligenz; Logistik 4.0; Reinforcement Learning; Robotik; artificial intelligence; autonomous mobile robots; logistics 4.0; robotics",
  abstract = 	"With the transition to Logistics 4.0, the increasing demand for autonomous mobile robots (AMR) in logistics has amplified the complexity of fleet control in dynamic environments. Reinforcement learning (RL), particularly decentralized RL algorithms, has emerged as a potential solution given its ability to learn in uncertain terrains. While discrete RL structures have shown merit, their adaptability in logistics remains questionable due to their inherent limitations. This paper presents a comparative analysis of continuous RL algorithms - Advantage Actor-Critic (A2C), Deep Deterministic Policy Gradient (DDPG), and Proximal Policy Optimization (PPO) - in the context of controlling a Turtlebot3 within a warehouse scenario. Our findings reveal A2C as the frontrunner in terms of success rate and training time, while DDPG excels at step minimization and PPO distinguishes itself primarily through its relatively short training duration. This study underscores the potential of continuous RL algorithms, especially A2C, in the future of AMR fleet management in logistics. Significant work remains to be done, particularly in the area of algorithmic fine-tuning.",
  issn = 	"2192-9084",
  doi = 	"10.2195/lj_proc_wesselhoeft_en_202310_01",
  url = 	"http://nbn-resolving.de/urn:nbn:de:0009-14-58257"
}
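
For reference, here is a minimal LaTeX sketch showing how the BibTeX entry above could be cited once saved to a bibliography file; the filename references.bib is an assumption, and biblatex with the biber backend is chosen because it handles the UTF-8 author names directly:

\documentclass{article}
\usepackage[backend=biber]{biblatex}
\addbibresource{references.bib} % assumed filename containing the entry above
\begin{document}
Continuous single-agent RL controls for AMRs are compared in \cite{wesselhoeft2023}.
\printbibliography
\end{document}

With classic BibTeX instead of biber, the same entry works via \bibliographystyle{plain} and \bibliography{references}, provided the cite key stays ASCII-only.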


RIS

TY  - JOUR
AU  - Wesselhöft, Mike
AU  - Braun, Philipp
AU  - Kreutzfeldt, Jochen
PY  - 2023
DA  - 2023//
TI  - Comparing Continuous Single-Agent Reinforcement Learning Controls in a Simulated Logistic Environment using NVIDIA Omniverse
JO  - Logistics Journal : Proceedings
VL  - 2023
IS  - 1
KW  - Autonome Roboter
KW  - Künstliche Intelligenz
KW  - Logistik 4.0
KW  - Reinforcement Learning
KW  - Robotik
KW  - artificial intelligence
KW  - autonomous mobile robots
KW  - logistics 4.0
KW  - robotics
AB  - With the transition to Logistics 4.0, the increasing demand for autonomous mobile robots (AMR) in logistics has amplified the complexity of fleet control in dynamic environments. Reinforcement learning (RL), particularly decentralized RL algorithms, has emerged as a potential solution given its ability to learn in uncertain terrains. While discrete RL structures have shown merit, their adaptability in logistics remains questionable due to their inherent limitations. This paper presents a comparative analysis of continuous RL algorithms - Advantage Actor-Critic (A2C), Deep Deterministic Policy Gradient (DDPG), and Proximal Policy Optimization (PPO) - in the context of controlling a Turtlebot3 within a warehouse scenario. Our findings reveal A2C as the frontrunner in terms of success rate and training time, while DDPG excels at step minimization and PPO distinguishes itself primarily through its relatively short training duration. This study underscores the potential of continuous RL algorithms, especially A2C, in the future of AMR fleet management in logistics. Significant work remains to be done, particularly in the area of algorithmic fine-tuning.
SN  - 2192-9084
UR  - http://nbn-resolving.de/urn:nbn:de:0009-14-58257
DO  - 10.2195/lj_proc_wesselhoeft_en_202310_01
ID  - wesselhoeft2023
ER  - 

Wordbib

<?xml version="1.0" encoding="UTF-8"?>
<b:Sources SelectedStyle="" xmlns:b="http://schemas.openxmlformats.org/officeDocument/2006/bibliography"  xmlns="http://schemas.openxmlformats.org/officeDocument/2006/bibliography" >
<b:Source>
<b:Tag>wesselhoeft2023</b:Tag>
<b:SourceType>ArticleInAPeriodical</b:SourceType>
<b:Year>2023</b:Year>
<b:PeriodicalTitle>Logistics Journal : Proceedings</b:PeriodicalTitle>
<b:Volume>2023</b:Volume>
<b:Issue>1</b:Issue>
<b:Url>http://nbn-resolving.de/urn:nbn:de:0009-14-58257</b:Url>
<b:Url>http://dx.doi.org/10.2195/lj_proc_wesselhoeft_en_202310_01</b:Url>
<b:Author>
<b:Author><b:NameList>
<b:Person><b:Last>Wesselhöft</b:Last><b:First>Mike</b:First></b:Person>
<b:Person><b:Last>Braun</b:Last><b:First>Philipp</b:First></b:Person>
<b:Person><b:Last>Kreutzfeldt</b:Last><b:First>Jochen</b:First></b:Person>
</b:NameList></b:Author>
</b:Author>
<b:Title>Comparing Continuous Single-Agent Reinforcement Learning Controls in a Simulated Logistic Environment using NVIDIA Omniverse</b:Title>
<b:Comments>With the transition to Logistics 4.0, the increasing demand for autonomous mobile robots (AMR) in logistics has amplified the complexity of fleet control in dynamic environments. Reinforcement learning (RL), particularly decentralized RL algorithms, has emerged as a potential solution given its ability to learn in uncertain terrains. While discrete RL structures have shown merit, their adaptability in logistics remains questionable due to their inherent limitations. This paper presents a comparative analysis of continuous RL algorithms - Advantage Actor-Critic (A2C), Deep Deterministic Policy Gradient (DDPG), and Proximal Policy Optimization (PPO) - in the context of controlling a Turtlebot3 within a warehouse scenario. Our findings reveal A2C as the frontrunner in terms of success rate and training time, while DDPG excels at step minimization and PPO distinguishes itself primarily through its relatively short training duration. This study underscores the potential of continuous RL algorithms, especially A2C, in the future of AMR fleet management in logistics. Significant work remains to be done, particularly in the area of algorithmic fine-tuning.</b:Comments>
</b:Source>
</b:Sources>

ISI

PT Journal
AU Wesselhöft, M
   Braun, P
   Kreutzfeldt, J
TI Comparing Continuous Single-Agent Reinforcement Learning Controls in a Simulated Logistic Environment using NVIDIA Omniverse
SO Logistics Journal : Proceedings
PY 2023
VL 2023
IS 1
DI 10.2195/lj_proc_wesselhoeft_en_202310_01
DE Autonome Roboter; Künstliche Intelligenz; Logistik 4.0; Reinforcement Learning; Robotik; artificial intelligence; autonomous mobile robots; logistics 4.0; robotics
AB With the transition to Logistics 4.0, the increasing demand for autonomous mobile robots (AMR) in logistics has amplified the complexity of fleet control in dynamic environments. Reinforcement learning (RL), particularly decentralized RL algorithms, has emerged as a potential solution given its ability to learn in uncertain terrains. While discrete RL structures have shown merit, their adaptability in logistics remains questionable due to their inherent limitations. This paper presents a comparative analysis of continuous RL algorithms - Advantage Actor-Critic (A2C), Deep Deterministic Policy Gradient (DDPG), and Proximal Policy Optimization (PPO) - in the context of controlling a Turtlebot3 within a warehouse scenario. Our findings reveal A2C as the frontrunner in terms of success rate and training time, while DDPG excels at step minimization and PPO distinguishes itself primarily through its relatively short training duration. This study underscores the potential of continuous RL algorithms, especially A2C, in the future of AMR fleet management in logistics. Significant work remains to be done, particularly in the area of algorithmic fine-tuning.
ER


MODS

<mods>
  <titleInfo>
    <title>Comparing Continuous Single-Agent Reinforcement Learning Controls in a Simulated Logistic Environment using NVIDIA Omniverse</title>
  </titleInfo>
  <name type="personal">
    <namePart type="family">Wesselhöft</namePart>
    <namePart type="given">Mike</namePart>
  </name>
  <name type="personal">
    <namePart type="family">Braun</namePart>
    <namePart type="given">Philipp</namePart>
  </name>
  <name type="personal">
    <namePart type="family">Kreutzfeldt</namePart>
    <namePart type="given">Jochen</namePart>
  </name>
  <abstract>With the transition to Logistics 4.0, the increasing demand for autonomous mobile robots (AMR) in logistics has amplified the complexity of fleet control in dynamic environments. Reinforcement learning (RL), particularly decentralized RL algorithms, has emerged as a potential solution given its ability to learn in uncertain terrains. While discrete RL structures have shown merit, their adaptability in logistics remains questionable due to their inherent limitations. This paper presents a comparative analysis of continuous RL algorithms - Advantage Actor-Critic (A2C), Deep Deterministic Policy Gradient (DDPG), and Proximal Policy Optimization (PPO) - in the context of controlling a Turtlebot3 within a warehouse scenario. Our findings reveal A2C as the frontrunner in terms of success rate and training time, while DDPG excels at step minimization and PPO distinguishes itself primarily through its relatively short training duration. This study underscores the potential of continuous RL algorithms, especially A2C, in the future of AMR fleet management in logistics. Significant work remains to be done, particularly in the area of algorithmic fine-tuning.</abstract>
  <subject>
    <topic>Autonome Roboter</topic>
    <topic>Künstliche Intelligenz</topic>
    <topic>Logistik 4.0</topic>
    <topic>Reinforcement Learning</topic>
    <topic>Robotik</topic>
    <topic>artificial intelligence</topic>
    <topic>autonomous mobile robots</topic>
    <topic>logistics 4.0</topic>
    <topic>reinforcement learning</topic>
    <topic>robotics</topic>
  </subject>
  <classification authority="ddc">620</classification>
  <relatedItem type="host">
    <genre authority="marcgt">periodical</genre>
    <genre>academic journal</genre>
    <titleInfo>
      <title>Logistics Journal : Proceedings</title>
    </titleInfo>
    <part>
      <detail type="volume">
        <number>2023</number>
      </detail>
      <detail type="issue">
        <number>1</number>
      </detail>
      <date>2023</date>
    </part>
  </relatedItem>
  <identifier type="issn">2192-9084</identifier>
  <identifier type="urn">urn:nbn:de:0009-14-58257</identifier>
  <identifier type="doi">10.2195/lj_proc_wesselhoeft_en_202310_01</identifier>
  <identifier type="uri">http://nbn-resolving.de/urn:nbn:de:0009-14-58257</identifier>
  <identifier type="citekey">wesselhöft2023</identifier>
</mods>
