GitHub - onceupon/Bash-Oneliner: A collection of handy Bash One-Liners and terminal tricks for data processing and Linux system maintenance.
Jul 22, 2024 21:15 • github.com • GitHub
A collection of handy Bash One-Liners and terminal tricks for data processing and Linux system maintenance.
GitHub - MrCordeiro/barra-ai: No-Nonsense AI Browser Extension
Jul 22, 2024 21:15 • github.com • GitHub
No-Nonsense AI Browser Extension.
How I Use Obsidian · Jason A. Heppler
Jul 22, 2024 21:14 • jasonheppler.org • Jason Heppler - Historian
I mentioned on my recent appearance on Drafting the Past that I have migrated all of my historical research work into Obsidian, which prompted a few folks on Bluesky to ask about some details on how I use it. Here’s a run-down of what that’s like from the perspective of writing and historical research.
Polymarket - 2024 Presidential Election Predictions
Jul 22, 2024 21:09 • polymarket.com • Polymarket
2024 Presidential Election Predictions. Trust markets, not presidential polls. Live and accurate forecasts by the world's largest prediction market.
A Survey of Prompt Engineering Methods in Large Language Models for Different NLP Tasks
Jul 22, 2024 21:08 • arxiv.org • arXiv.org
Large language models (LLMs) have shown remarkable performance on many different Natural Language Processing (NLP) tasks. Prompt engineering plays a key role in extending the existing abilities of LLMs to achieve significant performance gains on various NLP tasks. Prompt engineering requires composing natural language instructions called prompts to elicit knowledge from LLMs in a structured way. Unlike previous state-of-the-art (SoTA) models, prompt engineering does not require extensive parameter re-training or fine-tuning for the given NLP task and thus operates solely on the embedded knowledge of LLMs. Additionally, LLM enthusiasts can intelligently extract LLMs' knowledge through a basic natural language conversational exchange or prompt engineering, allowing more and more people, even without a deep mathematical machine learning background, to experiment with LLMs. With prompt engineering gaining popularity over the last two years, researchers have come up with numerous techniques for designing prompts that improve the accuracy of information extraction from LLMs. In this paper, we summarize different prompting techniques and group them by the NLP tasks they have been used for. We further highlight, at a granular level, the performance of these prompting strategies on the various datasets belonging to each task, describe the corresponding LLMs used, present a taxonomy diagram, and discuss the possible SoTA for specific datasets. In total, we survey 44 research papers covering 39 different prompting methods on 29 different NLP tasks, most of which have been published in the last two years.
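The core mechanic the survey catalogues is composing an instruction plus in-context examples as plain text, with no weight updates. As a rough illustration (not taken from the paper), a minimal few-shot classification prompt could be assembled like this in Python; the task, the examples, and the `call_llm` stub are assumptions of the sketch, not anything the survey defines:

```python
# Minimal sketch of few-shot prompting for a sentiment-classification task.
# The examples, labels, and the call_llm stub are illustrative assumptions.

FEW_SHOT_EXAMPLES = [
    ("The plot was predictable and the acting flat.", "negative"),
    ("A delightful surprise from start to finish.", "positive"),
]

def build_prompt(query: str) -> str:
    """Compose a natural-language instruction plus in-context examples."""
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for review, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Review: {review}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)

def call_llm(prompt: str) -> str:
    """Stand-in for whatever LLM endpoint is available; replace as needed."""
    raise NotImplementedError

if __name__ == "__main__":
    # Print the assembled prompt; in practice it would be sent to an LLM.
    print(build_prompt("I would happily watch it again."))
```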
Context Embeddings for Efficient Answer Generation in RAG
Jul 22, 2024 21:03 • arxiv.org • arXiv.org
Retrieval-Augmented Generation (RAG) allows overcoming the limited knowledge of LLMs by extending the input with external information. As a consequence, the contextual inputs to the model become much longer, which slows down decoding and directly translates to the time a user has to wait for an answer. We address this challenge by presenting COCOM, an effective context compression method that reduces long contexts to only a handful of Context Embeddings, speeding up generation time by a large margin. Our method allows for different compression rates, trading off decoding time against answer quality. Compared to earlier methods, COCOM handles multiple contexts more effectively, significantly reducing decoding time for long inputs. Our method demonstrates a speed-up of up to 5.69× while achieving higher performance compared to existing efficient context compression methods.
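The abstract only sketches the mechanism: each retrieved passage is squeezed into a small, fixed number of context embeddings, so the decoder reads a few vectors instead of hundreds of tokens. The NumPy snippet below is a toy stand-in for the learned compressor (mean pooling and the tensor shapes are assumptions, not the paper's method) that just makes the length reduction concrete:

```python
# Toy illustration of context compression for RAG: instead of feeding every
# token embedding of a retrieved passage to the decoder, the passage is
# squeezed into a handful of "context embeddings". COCOM learns this
# compression; the mean pooling here is only an illustrative stand-in.

import numpy as np

def compress_context(token_embeddings: np.ndarray, num_ctx_embeddings: int) -> np.ndarray:
    """Reduce (num_tokens, dim) token embeddings to (num_ctx_embeddings, dim)."""
    chunks = np.array_split(token_embeddings, num_ctx_embeddings, axis=0)
    return np.stack([chunk.mean(axis=0) for chunk in chunks])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    passage = rng.normal(size=(512, 768))      # 512 context tokens, dim 768 (assumed)
    compressed = compress_context(passage, 4)  # keep only 4 context embeddings
    print(passage.shape, "->", compressed.shape)
    print("compression rate:", passage.shape[0] // compressed.shape[0], "x")
```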
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models
Jul 22, 2024 21:02 • arxiv.org • arXiv.org
Spreadsheets, with their extensive two-dimensional grids, various layouts, and diverse formatting options, present notable challenges for large language models (LLMs). In response, we introduce SpreadsheetLLM, pioneering an efficient encoding method designed to unleash and optimize LLMs' powerful understanding and reasoning capability on spreadsheets. Initially, we propose a vanilla serialization approach that incorporates cell addresses, values, and formats. However, this approach is limited by LLMs' token constraints, making it impractical for most applications. To tackle this challenge, we develop SheetCompressor, an innovative encoding framework that compresses spreadsheets effectively for LLMs. It comprises three modules: structural-anchor-based compression, inverse index translation, and data-format-aware aggregation. It significantly improves performance on the spreadsheet table detection task, outperforming the vanilla approach by 25.6% in GPT-4's in-context learning setting. Moreover, a fine-tuned LLM with SheetCompressor has an average compression ratio of 25x yet achieves a state-of-the-art 78.9% F1 score, surpassing the best existing models by 12.3%. Finally, we propose Chain of Spreadsheet for downstream tasks of spreadsheet understanding and validate it on a new and demanding spreadsheet QA task. We methodically leverage the inherent layout and structure of spreadsheets, demonstrating that SpreadsheetLLM is highly effective across a variety of spreadsheet tasks.
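To make the encoding contrast concrete, here is a toy sketch of the vanilla address/value/format serialization alongside an inverted-index grouping that collapses repeated cell values, which is the flavor of the inverse index translation module; the record layout and the example sheet are assumptions, not the paper's exact format:

```python
# Sketch of the two encoding ideas named in the abstract, on a toy sheet:
# (1) "vanilla" serialization listing every cell address, value and format;
# (2) an inverted-index style translation that groups cells by identical
#     value so repeated values are not re-serialized.
# The record layout and example data are assumptions for illustration only.

from collections import defaultdict

sheet = {  # toy spreadsheet: address -> (value, format)
    "A1": ("Region", "text"), "B1": ("Sales", "text"),
    "A2": ("North", "text"),  "B2": ("1000", "number"),
    "A3": ("South", "text"),  "B3": ("1000", "number"),
}

def vanilla_serialize(cells: dict) -> str:
    """One record per cell: address, value, format."""
    return "\n".join(f"{addr}|{val}|{fmt}" for addr, (val, fmt) in cells.items())

def inverse_index(cells: dict) -> dict:
    """Group addresses by value, so duplicate values collapse into one entry."""
    index = defaultdict(list)
    for addr, (val, _fmt) in cells.items():
        index[val].append(addr)
    return dict(index)

if __name__ == "__main__":
    print(vanilla_serialize(sheet))
    print(inverse_index(sheet))  # e.g. '1000' -> ['B2', 'B3']
```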
El coste basal del software
Jul 22, 2024 21:00 • www.eferro.net
Translation of the original article Basal Cost of Software. Translation by https://x.com/simonvlc, originally published on his geni...
BeyondPDF - Search with ideas
Jul 22, 2024 20:59 • omkaark.github.io
Simple PDF search for the sophisticated
Managing up: 3 things I wish I realized sooner
Jul 18, 2024 16:37 • read.highgrowthengineer.com • High Growth Engineer
Removing uncertainty, managing your priorities, and showing a growth mindset