Search results: resource list
Main
- AI reinforcement learning grid walk: a Java implementation of a reinforcement learning agent that searches a Grid World from the start point to the goal state, printing how many times the goal cell was reached in every 1,000 steps. Reward: goal -> +1, rest -> 0.
getPoints
- Basis point: 1/100 of one percent, denoted bp, bps, and ‱. Pivot point: a price level of significance in the analysis of a financial market, used as a predictive indicator of market movement. Point (mortgage): a percentage sometimes refe
LMin
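- Russian nesting dolls: north-south and east-west roads cross to form a grid. At the intersections sit solid-gold Russian dolls of varying weight and size, and any heavier doll can hold any lighter one. You may run along the roads and pick up the dolls at the intersections, provided that at every moment what you carry is a single nested doll; once dolls are nested they cannot be taken apart again. Do not travel the same road twice. Plan a route that maximizes the haul.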
a
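- Create an employee class representing employee information, with data members name, empNo, and salary for the employee's name, ID number, and monthly salary. Derive three classes from employee: worker, technician, and salesman, representing ordinary workers, research staff, and sales staff. The three classes contain the data members productNum, workHours, and monthlysales respectively: the number of products a worker produces per month, the hours a technician works per month, and a salesman's monthly sales. Every class must include a member function pay that computes the employee's monthly salary, under the following assumption: the monthly salary of an ordinary worker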
pso
- PSO example that provides the function for the PSO global optimum, together with a reward function and a combined reward-and-penalty function.
java12
- If Linghu Chong's Java score is greater than 90 and his C score is greater than 80, his master rewards him; alternatively, if his Java score equals 100 and his music score is greater than 70, his master also rewards him.
code-and-dataset
- Implementation of image cosegmentation using a color reward strategy and active contours. I ran this code and it worked correctly. The code is made freely available by its author, Fanman Meng.
MDPgridworldExample
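- The world is freespaces (0) or obstacles (1). Each turn the robot can move in 8 directions, or stay in place. A reward function gives one freespace, the goal location, a high reward. All other freespaces carry a small penalty, and obstacles carry a large negative reward. Value iteration is used to learn an optimal "policy", a function that assigns a control input to every possible location.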
Reward
- An MFC-based Double Color Ball lottery number picker, developed with VC++ 6.0. Red balls and blue balls must be picked at random in a dialog box: when the user clicks the "Start" button, number selection begins; when the user clicks the "Stop" button, the ball numbers are displayed in the dialog.
Q-Learning-master
- Q-Learning implementation for a simple navigation problem: a robot moving on a 5 x 5 grid with one arbitrary goal (reward of +10) and three arbitrary obstacles (reward of -10).
DGP-IRL-master
- We propose a new approach to inverse reinforcement learning (IRL) based on the deep Gaussian process (deep GP) model, which is capable of learning complicated reward structures with few demonstrations.
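
To illustrate the kind of setup the Q-Learning-master entry describes, here is a minimal tabular Q-learning sketch in Python. It is not taken from that package. The listing only states a 5 x 5 grid, one goal (+10), and three obstacles (-10); the cell positions, the four-direction moves, ending an episode when an obstacle is hit, and every hyperparameter below are assumptions made for illustration.

```python
import random

SIZE = 5
GOAL = (4, 4)                                   # assumed goal cell
OBSTACLES = {(1, 1), (2, 3), (3, 1)}            # assumed obstacle cells
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]    # up, down, left, right (assumed)


def step(state, action):
    """Apply an action; a move off the grid leaves the robot in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < SIZE and 0 <= c < SIZE):
        r, c = state
    nxt = (r, c)
    if nxt == GOAL:
        return nxt, 10.0, True       # goal reward, episode ends
    if nxt in OBSTACLES:
        return nxt, -10.0, True      # obstacle reward, episode ends (assumption)
    return nxt, 0.0, False


def choose(q_row, rng, epsilon):
    """Epsilon-greedy action choice with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    return rng.choice([i for i, v in enumerate(q_row) if v == best])


def train(episodes=3000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table mapping each cell to one value per action."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(SIZE) for c in range(SIZE)}
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(200):         # cap on episode length
            a = choose(q[state], rng, epsilon)
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-learning update: bootstrap from the best next-state value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, start=(0, 0), max_steps=50):
    """Follow the highest-valued action from start until a terminal cell."""
    path, state = [start], start
    for _ in range(max_steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _, done = step(state, ACTIONS[a])
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

Running the script prints the cells visited when following the learned greedy policy from the top-left corner. Random tie-breaking in `choose` matters here: with an all-zero initial table, always taking the first action would keep the robot pressed against the top wall and it would rarely find the goal.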