The Language-Guided Navigation module leverages an LLM (like ChatGPT) and the open-set O3D-SIM.The Language-Guided Navigation module leverages an LLM (like ChatGPT) and the open-set O3D-SIM.

VLN: LLM and CLIP for Instance-Specific Navigation on 3D Maps

Abstract and 1 Introduction

  1. Related Works

    2.1. Vision-and-Language Navigation

    2.2. Semantic Scene Understanding and Instance Segmentation

    2.3. 3D Scene Reconstruction

  2. Methodology

    3.1. Data Collection

    3.2. Open-set Semantic Information from Images

    3.3. Creating the Open-set 3D Representation

    3.4. Language-Guided Navigation

  3. Experiments

    4.1. Quantitative Evaluation

    4.2. Qualitative Results

  4. Conclusion and Future Work, Disclosure statement, and References

3.4. Language-Guided Navigation

In this section, we leverage the LLM-based approach from [1], which uses ChatGPT [35] to understand and map language commands to pre-defined function primitives that the robot can understand and execute. However, there are a few differences between our current approach and the approach in [1] regarding the use case of the LLM and the implementation of our function primitives. The previous approach used the LLM’s ability to bring in an open-set understanding by mapping general queries to the already-known closed-set class labels obtained via Mask2Former [7].

\ However, given the open-set nature of our new representation, O3D-SIM, the LLM does not need to do that. Figure 4 shows both approaches’ code output differences. The function primitives work similarly to the older approach, requiring the desired object type and its instance as an input. But now, the desired object is not from a pre-defined set of classes but a small query defining the object, so the implementation to find the desired location changes. We use the text and image-aligned nature of CLIP embeddings to find the desired object, where the input description is passed to the model, and its corresponding embedding is used to find the object in O3D-SIM.

\ A cosine similarity is calculated between the embedding of the description and all the embeddings of our representation. These are ranked in a decreasing order, and the desired instance is selected. Once the instance is finalized, a goal corresponding to this instance is generated and passed to the navigation stack for autonomous navigation of the robot, hence achieving Language-Guided Navigation.

\

:::info Authors:

(1) Laksh Nanwani, International Institute of Information Technology, Hyderabad, India; this author contributed equally to this work;

(2) Kumaraditya Gupta, International Institute of Information Technology, Hyderabad, India;

(3) Aditya Mathur, International Institute of Information Technology, Hyderabad, India; this author contributed equally to this work;

(4) Swayam Agrawal, International Institute of Information Technology, Hyderabad, India;

(5) A.H. Abdul Hafez, Hasan Kalyoncu University, Sahinbey, Gaziantep, Turkey;

(6) K. Madhava Krishna, International Institute of Information Technology, Hyderabad, India.

:::


:::info This paper is available on arxiv under CC by-SA 4.0 Deed (Attribution-Sharealike 4.0 International) license.

:::

\

Market Opportunity
Large Language Model Logo
Large Language Model Price(LLM)
$0.0003435
$0.0003435$0.0003435
-0.72%
USD
Large Language Model (LLM) Live Price Chart
Disclaimer: The articles reposted on this site are sourced from public platforms and are provided for informational purposes only. They do not necessarily reflect the views of MEXC. All rights remain with the original authors. If you believe any content infringes on third-party rights, please contact service@support.mexc.com for removal. MEXC makes no guarantees regarding the accuracy, completeness, or timeliness of the content and is not responsible for any actions taken based on the information provided. The content does not constitute financial, legal, or other professional advice, nor should it be considered a recommendation or endorsement by MEXC.

You May Also Like

Woodway Assurance receives $1 million in funding for data privacy assurance solution EviData

Woodway Assurance receives $1 million in funding for data privacy assurance solution EviData

OTTAWA, ON, Dec. 17, 2025 /PRNewswire/ – New Canadian technology company Woodway Assurance is proud to announce that it has closed an oversubscribed seed funding
Share
AI Journal2025/12/17 23:16
OpenVPP accused of falsely advertising cooperation with the US government; SEC commissioner clarifies no involvement

OpenVPP accused of falsely advertising cooperation with the US government; SEC commissioner clarifies no involvement

PANews reported on September 17th that on-chain sleuth ZachXBT tweeted that OpenVPP ( $OVPP ) announced this week that it was collaborating with the US government to advance energy tokenization. SEC Commissioner Hester Peirce subsequently responded, stating that the company does not collaborate with or endorse any private crypto projects. The OpenVPP team subsequently hid the response. Several crypto influencers have participated in promoting the project, and the accounts involved have been questioned as typical influencer accounts.
Share
PANews2025/09/17 23:58
BlackRock boosts AI and US equity exposure in $185 billion models

BlackRock boosts AI and US equity exposure in $185 billion models

The post BlackRock boosts AI and US equity exposure in $185 billion models appeared on BitcoinEthereumNews.com. BlackRock is steering $185 billion worth of model portfolios deeper into US stocks and artificial intelligence. The decision came this week as the asset manager adjusted its entire model suite, increasing its equity allocation and dumping exposure to international developed markets. The firm now sits 2% overweight on stocks, after money moved between several of its biggest exchange-traded funds. This wasn’t a slow shuffle. Billions flowed across multiple ETFs on Tuesday as BlackRock executed the realignment. The iShares S&P 100 ETF (OEF) alone brought in $3.4 billion, the largest single-day haul in its history. The iShares Core S&P 500 ETF (IVV) collected $2.3 billion, while the iShares US Equity Factor Rotation Active ETF (DYNF) added nearly $2 billion. The rebalancing triggered swift inflows and outflows that realigned investor exposure on the back of performance data and macroeconomic outlooks. BlackRock raises equities on strong US earnings The model updates come as BlackRock backs the rally in American stocks, fueled by strong earnings and optimism around rate cuts. In an investment letter obtained by Bloomberg, the firm said US companies have delivered 11% earnings growth since the third quarter of 2024. Meanwhile, earnings across other developed markets barely touched 2%. That gap helped push the decision to drop international holdings in favor of American ones. Michael Gates, lead portfolio manager for BlackRock’s Target Allocation ETF model portfolio suite, said the US market is the only one showing consistency in sales growth, profit delivery, and revisions in analyst forecasts. “The US equity market continues to stand alone in terms of earnings delivery, sales growth and sustainable trends in analyst estimates and revisions,” Michael wrote. He added that non-US developed markets lagged far behind, especially when it came to sales. This week’s changes reflect that position. The move was made ahead of the Federal…
Share
BitcoinEthereumNews2025/09/18 01:44