Identifying Where a Picture Was Taken

Imagine this scenario: you catch up with a friend who’s just returned from a trip to India. You’ve traveled extensively through India yourself and recommended that your friend holiday there. When you meet, your friend tells you about seeing a monkey wearing a red hat in front of a temple. You remember seeing that very same monkey several years ago! You whip out your phone to find the photo as proof. This is how the next few minutes of conversation might sound: “I know it’s here somewhere! No, no. It was before we visited the waterfall. Hmm. It was after the beach, I’m pretty sure…” Frustrating, right?

Holidays usually last a week or two and involve taking a near-continuous stream of holiday snaps. Finding a single image taken some time ago on your phone can be tricky. It can be frustrating, but it isn’t impossible: we intuitively replay a sequence of mental images that tells us where we were at any one point in time.

Place Recognition

This recognition and recall of previously seen information is what makes up the problem of place recognition. Finding the holiday snap of that one monkey wearing a red hat on your phone involves the same mental navigation. As you thumb through images, your mind goes back in time and mentally replays the duration of your holiday, localizing when and where you saw the monkey.

Here’s another way of explaining it. When you give someone directions to a location, it’s common to say something like: ‘Go straight ahead until you see the famous burger place, take a left and then you’ll see a pharmacy right in front of you. Take a right there…’ All this information is useless if they are unable to match what they see to what you described.

At Larger Scales

Now, imagine if your mobile phone never stopped capturing images, day and night. It would be virtually impossible to navigate through all the content to find one image of a monkey wearing a red hat from a sequence of holiday snaps taken months or years ago. This is precisely the problem autonomous vehicles have to solve. In place of photos on a mobile phone, these robots must make sense of a continuous stream of video sequences (equivalent to millions of images) captured while in motion throughout their operational lifetime.

For robots, successful localization over large scale observations happens through the process of, you guessed it, scalable place recognition.

Robots Problem-Solve Like Humans

As humans, we carry a representation of the world in our heads all the time. It’s a similar situation for autonomous vehicles. In order to make effective decisions ‘on the go’, self-driving cars must make sense of a never-ending sequence of images as quickly as possible.

Humans and robots also share an ability to access outside help in the form of GPS navigation/localization tools. GPS, however, is not accurate enough for all tasks in all scenarios — underwater, underground, on Mars! This is problematic when it comes to the reliability of autonomous cars and their ability to make accurate, real-time decisions. Fortunately, if robots, including automated vehicles, have seen something before, this information should help self-localization.

Think back to a time when you were lost in a new city. After randomly walking around, you suddenly see a building you noticed earlier, maybe because of its color or size. Thanks to the place recognition algorithm running in your head, you work out where the building is situated in relation to your hotel, allowing you a safe return. We don’t rely on a single image or scene to recognize where we are. We make sense of things by playing back a sequence of connected images (and memories) to navigate back to the hotel.

As part of the Australian Centre for Robot Vision, we’ve applied this same ‘human’ strategy to robots. We use scalable place recognition to match what a robot is actively seeing to millions of previously observed images. Individual images may not be informative enough for localisation. However, gathering bits and pieces of evidence from each image and using that to reason over sequences has shown great promise for localisation, even when the appearance of images changes due to weather, time of day, etc.

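To make the idea of reasoning over sequences a little more concrete, here is a minimal, illustrative Python sketch (not the actual ICCV 2019 method): per-frame descriptor similarities are pooled along a short aligned window, so a place is recognised by the whole sequence rather than by any single frame. The descriptor size, window length and constant-velocity alignment are simplifying assumptions.

```python
import numpy as np

def sequence_match(query_desc, ref_desc, seq_len=5):
    """Score each reference start index by pooling per-frame similarity
    along a short aligned window, instead of trusting any single frame."""
    # Cosine similarity between every query frame and every reference frame.
    q = query_desc / np.linalg.norm(query_desc, axis=1, keepdims=True)
    r = ref_desc / np.linalg.norm(ref_desc, axis=1, keepdims=True)
    sim = q @ r.T                          # shape: (num_query, num_ref)

    num_ref = ref_desc.shape[0]
    scores = np.full(num_ref, -np.inf)
    # Assume query frame i lines up with reference frame j + i
    # (a constant-velocity alignment, the simplest possible motion model).
    for j in range(num_ref - seq_len + 1):
        window = sim[np.arange(seq_len), np.arange(j, j + seq_len)]
        scores[j] = window.mean()          # evidence pooled over the sequence
    return scores

# Toy usage: a 5-frame query against 100 reference frames with 256-D descriptors.
rng = np.random.default_rng(0)
refs = rng.standard_normal((100, 256))
query = refs[40:45] + 0.1 * rng.standard_normal((5, 256))   # noisy revisit of place 40
print(int(np.argmax(sequence_match(query, refs))))          # should print roughly 40
```

Even with noisy individual frames, the pooled score over the window points to the right place, which is the intuition behind sequence-based localisation.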

Machine learning to the rescue?

While machine learning is used to solve most problems in the field of robotics, our focus on sequential reasoning takes a slightly old-fashioned route. The good news, as shown in our work at ICCV 2019, is that sequential reasoning outperforms bespoke deep learning-based approaches to scalable place recognition for robots. This surprised us as well!

A big limitation of current deep learning methods is their inability to generalize to unseen scenarios. By contrast, we have shown that our method works ‘out of the box’ across the wide range of environments we trialed. Additionally, because there are no learned components in sequential reasoning, our method is not limited in scope and can be informed by millions of images as soon as they are seen.

The Road Ahead: Next Steps and Challenges

Autonomous cars will soon become a reality on our roads. To navigate safely, these future robots will need to see and understand countless thousands of images each time they drive on a road.

There are still challenges to overcome. For example, as humans, we have little problem working out where we are, even in changing conditions. We can look out the window of our car on a dark, stormy night and still be able to localise where we are. Robots, however, still find it hard to interpret images when the appearance of a location changes due to external or environmental factors (light, weather, obstructions, etc.).

A more immediate problem to overcome is storage. All the images seen by an autonomous vehicle need to be stored in memory to enable sequential reasoning about them.

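To get a feel for the scale involved, here is a back-of-envelope comparison of storing raw frames versus compact per-image descriptors over a single day of driving. The frame rate, resolution and descriptor size below are illustrative assumptions, not figures from this work.

```python
# Back-of-envelope: the cost of remembering every frame, per day of driving.
# All numbers below are illustrative assumptions, not figures from this work.
frames_per_day = 10 * 60 * 60 * 8            # 10 fps for 8 hours = 288,000 frames
raw_frame_bytes = 1920 * 1080 * 3            # one uncompressed 1080p RGB image
descriptor_bytes = 4096 * 4                  # one 4096-D float32 descriptor

for label, per_frame in [("raw images", raw_frame_bytes),
                         ("compact descriptors", descriptor_bytes)]:
    gib_per_day = frames_per_day * per_frame / 2**30
    print(f"{label:20s}: {gib_per_day:8.1f} GiB per day")
```

Under these assumptions, raw frames run to well over a terabyte per day, while compact descriptors stay in the single-digit gigabytes, which is why what gets stored, and for how long, matters so much.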

The current challenge for us is to come up with a ‘life-long place recognition’ method that will continue to operate under every conceivable circumstance — no matter the weather or time of day — and across large scales. We want to combine what we know presently about our location to make predictions about where we will be in the next instant in time. Using this information, we can then limit the set of images that we need to reason over. The reasoning behind it goes something like this. If I know I am somewhere in the Adelaide CBD, it’s highly unlikely (actually, impossible in the absence of teleportation) that, in the next moment, I’ll find myself somewhere in Perth. Therefore, let’s only look at images close to my current location and mark everything else as ‘not likely’.

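Here is a hypothetical sketch of that pruning step, assuming each stored reference image is tagged with a rough map position: a simple radius test around the predicted next position keeps nearby candidates and marks everything else as 'not likely' before any expensive sequence matching runs. The coordinates and threshold are made up for illustration.

```python
import numpy as np

def reachable_candidates(ref_positions, predicted_pos, max_travel_m):
    """Return indices of reference images physically reachable from the
    predicted position, so sequential reasoning only has to score those."""
    dists = np.linalg.norm(ref_positions - predicted_pos, axis=1)
    return np.flatnonzero(dists <= max_travel_m)

# Toy usage: reference images tagged with 2-D map coordinates (in metres).
ref_positions = np.array([[0.0, 0.0],            # near the Adelaide CBD
                          [150.0, 40.0],         # a few streets away
                          [2_100_000.0, 0.0]])   # roughly "Perth-far"
predicted_pos = np.array([120.0, 30.0])          # where we expect to be next
print(reachable_candidates(ref_positions, predicted_pos, max_travel_m=500.0))
# -> [0 1]: the far-away reference is marked 'not likely' and skipped
```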

We are hopeful that by combining sequential analysis with memory management, we will be able to achieve a method that enables an autonomous vehicle to localise itself over arbitrarily large image collections.

Imagine this future scenario: autonomous cars that can work together to capture a real-time snapshot of what the world looks like at any given moment and how it changes from day to day. Such large-scale place recognition methods could enable precise localization for each self-driving car by looking at the world through the combined eyes of all cars.

Exciting times, indeed!

Translated from: https://towardsdatascience/how-do-robots-find-themselves-in-an-ever-changing-world-19eda1956c56
