[๋…ผ๋ฌธ๋ฆฌ๋ทฐ] Adaptive stock trading strategies with deep reinforcement learning methods
ยท
Artificial_Intelligence๐Ÿค–/Reinforcement Learning
Adaptive stock trading strategies with deep reinforcement learning methods Wu, Xing, et al. "Adaptive stock trading strategies with deep reinforcement learning methods." Information Sciences 538 (2020): 142-158. Highlights - Gated Recurrent Unit is proposed to extract informative features from raw financial data. - Reward function is designed with risk-adjusted ratio for trading strategies for s..
[๋…ผ๋ฌธ๋ฆฌ๋ทฐ]Generative Question Refinement with Deep Reinforcement
ยท
Artificial_Intelligence๐Ÿค–/Reinforcement Learning
Generative Question Refinement with Deep Reinforcement Liu, Ye, et al. "Generative question refinement with deep reinforcement learning in retrieval-based QA system." Proceedings of the 28th ACM International Conference on Information and Knowledge Management. 2019. Abstract ์‹ค์ œ QA ์‹œ์Šคํ…œ์—์„œ ์ž˜๋ชป๋œ ๋‹จ์–ด, ์ž˜๋ชป๋œ ๋‹จ์–ด ์ˆœ์„œ, ๋…ธ์ด์ฆˆ ๊ฐ™์€ ์ž˜๋ชป๋œ ํ˜•์‹์˜ ์งˆ๋ฌธ๋“ค์ด ์ผ๋ฐ˜์ ์ด์—ฌ์„œ QA ์‹œ์Šคํ…œ์ด ์ด๋ฅผ ์ •ํ™•ํ•˜๊ฒŒ ์ดํ•ดํ•˜๊ณ  ๋‹ต๋ณ€์„ ๋˜์ง€์ง€ ๋ชปํ•˜๊ฒŒ ๋งŒ๋“ฌ. ์ด๋Ÿฌํ•œ ์ž˜๋ชป๋œ ํ˜•์‹์˜ ์งˆ๋ฌธ์„ ํšจ๊ณผ์ ์œผ๋กœ ์ œ๊ฑฐํ•˜๊ธฐ ..
[๋…ผ๋ฌธ๋ฆฌ๋ทฐ]Multi-DQN An ensemble of Deep Q-learning agents for stock market forecasting
ยท
Artificial_Intelligence๐Ÿค–/Reinforcement Learning
Multi-DQN An ensemble of Deep Q-learning agents for stock market forecasting Carta, Salvatore, et al. "Multi-DQN: An ensemble of Deep Q-learning agents for stock market forecasting." Expert systems with applications 164 (2021): 113820. ์ฃผ์š” ํ•˜์ด๋ผ์ดํŠธ A novel ensembling methodology of RL agents with different training experiences. Validation of such ensemble in intraday stock market trading. Different ..
Solve Titanic problem with Reinforcement Learning
ยท
Artificial_Intelligence๐Ÿค–/Reinforcement Learning
Introduction - Classification, Prediction using RL like DQN. - Different Existing deep learning is a method of increasing the accuracy of a model through a network. DQN is a reinforcement learning method in which Q values are selected and acted through a model. - Usually, RL does not used to solve classification or Prediction problems. DQN code used in the game is analyzed and refactored to be u..
DQN, A3C
ยท
Artificial_Intelligence๐Ÿค–/Reinforcement Learning
DQN (Deep Q Network) DQN is learning using deep learning neural networks such as CNN. Method of storing samples obtained from each time step, randomly selecting these samples to configure and update them into mini-batches. Existing RL, once it start learning in a bad way, it continue learning in a bad way. The problem is solved by randomly extracting and breaking the correlation between samples...
Multi-Agent Reinforcement Learning
ยท
Artificial_Intelligence๐Ÿค–/Reinforcement Learning
MARL(Multi-Agent Reinforcement Learning) - Trying to study Multi-Agent Algorithms in reinforcement learning. - For collaboration or competition, it is a field in which multiple agents interact with each other and find optimal behavior. - In reality, in order for Reinforcement learning to actually apply, the characteristics of various fields must be considered, so multi-agent consideration is ess..
[gym] CartPole-v0
ยท
Artificial_Intelligence๐Ÿค–/Reinforcement Learning
import gym import time env = gym.make('CartPole-v1') #๊ฐ•ํ™”ํ•™์Šต ํ™˜๊ฒฝ ๋ถˆ๋Ÿฌ์˜ค๊ธฐ for i_episode in range(20): # ์ƒˆ๋กœ์šด ์—ํ”ผ์†Œ๋“œ(initial environment)๋ฅผ ๋ถˆ๋Ÿฌ์˜จ๋‹ค(reset) observation = env.reset() for t in range(100): env.render() #ํ™”๋ฉด์— ์ถœ๋ ฅ / ํ–‰๋™ ์ทจํ•˜๊ธฐ ์ด์ „ ํ™˜๊ฒฝ์—์„œ ์–ป์€ ๊ด€์ฐฐ๊ฐ’(obsevation)์ ์šฉํ•ด์„œ ๊ทธ๋ฆผ time.sleep(0.05) # ํ–‰๋™(action)์„ ์ทจํ•˜๊ธฐ ์ด์ „์— ํ™˜๊ฒฝ์— ๋Œ€ํ•ด ์–ป์€ ๊ด€์ฐฐ๊ฐ’(observation) print('observation before action:') print(observation) action = env.action_space...
Reinforcement learning (๊ฐ•ํ™”ํ•™์Šต)
ยท
Artificial_Intelligence๐Ÿค–/Reinforcement Learning
์•„๋ฌด๊ฒƒ๋„ ์•ˆ์•Œ๋ ค์ฃผ๊ณ  ์ผ๋‹จ ์‹œ๋„ํ•ด๋ณด๊ณ  ์‹œํ–‰์ฐฉ์˜ค ๊ฒช์œผ๋ฉด์„œ ์‹ค๋ ฅ์„ ํ‚ค์›Œ๋‚˜๊ฐ€๊ฒŒ ํ•˜๋Š” ๋ฐฉ๋ฒ•์ž„. Agent๊ฐ€ ์˜ฌ๋ฐ”๋ฅธ ํ–‰๋™์„ ํ•˜๋ฉด ๋ณด์ƒ(rewards)์„ ์ฃผ๊ณ , ๋ถˆ๋ฆฌํ•œ ํ–‰๋™์„ ํ•˜๋ฉด ๋ฒŒ์ ์„ ๋ถ€์—ฌํ•ด์คŒ. ์ด๋ ‡๊ฒŒ ํ–‰๋™ ํ•˜๋‚˜ํ•˜๋‚˜๊ฐ€ ์Œ“์—ฌ์„œ ๋ณด์ƒ์ด ์ตœ๋Œ€ํ™”๊ฐ€ ๋˜๊ฒŒ ๋งŒ๋“œ๋Š” ํ•™์Šต๋ฐฉ๋ฒ•์ž„. Agent๊ฐ€ ์ž์‹ ์ด ์ž˜ํ•˜๊ณ  ์žˆ๋Š” ๊ฒƒ์ธ์ง€, ์ž˜ ๋ชปํ•˜๊ณ  ์žˆ๋Š” ๊ฒƒ์ธ์ง€ ํ™•์‹คํ•˜๊ฒŒ ์•Œ์•„์•ผํ•˜๊ธฐ์— ๋ฌด์กฐ๊ฑด scalar feedback์„ ํ•ด์•ผํ•จ. ex) +1, -3, +2.6, +2 ... ํ™˜๊ฒฝ์— ๋Œ€ํ•œ ์‚ฌ์ „์ง€์‹์ด ์—†๋Š” ์ƒํƒœ๋กœ ํ•™์Šต์ด ์‹œ์ž‘๋˜๊ณ , ๋ณด์ƒ์„ ํ†ตํ•˜์—ฌ ํ•™์Šต์„ ํ•จ. ์–ด๋– ํ•œ ํ–‰๋™์„ ํ–ˆ์„ ๋•Œ ํ™˜๊ฒฝ์ด ์–ด๋–ป๊ฒŒ ๋ฐ˜์‘ํ•˜๋Š”์ง€์— ๋”ฐ๋ฅธ ๋ณด์ƒ์ด ์ฃผ์–ด์ง. ์–ด๋–ป๊ฒŒ ๋ณด์ƒ์ด ์ตœ๋Œ€ํ™”๊ฐ€ ๋  ์ˆ˜ ์žˆ๋Š”์ง€ ํ•™์Šต์„ ํ•˜๋Š” ๊ฒƒ์ด ๊ฐ•ํ™”ํ•™์Šต. Agent : ์ฃผ์–ด์ง„ ๋ฌธ์ œ ๋‚ด์—์„œ ํ–‰๋™์„ ํ•˜๋Š” ์ฃผ์ฒด. Sta..
Liky
'Artificial_Intelligence๐Ÿค–/Reinforcement Learning' ์นดํ…Œ๊ณ ๋ฆฌ์˜ ๊ธ€ ๋ชฉ๋ก