GitHub - onceupon/Bash-Oneliner: A collection of handy Bash One-Liners and terminal tricks for data processing and Linux system maintenance.
Jul 22, 2024 21:15 • github.com • GitHub
A collection of handy Bash One-Liners and terminal tricks for data processing and Linux system maintenance.
GitHub - MrCordeiro/barra-ai: No-Nonsense AI Browser Extension
Jul 22, 2024 21:15 • github.com • GitHub
No-Nonsense AI Browser Extension.
How I Use Obsidian · Jason A. Heppler
Jul 22, 2024 21:14 • jasonheppler.org • Jason Heppler - Historian
I mentioned on my recent appearance on Drafting the Past that I have migrated all of my historical research work into Obsidian, which prompted a few folks on Bluesky to ask about some details on how I use it. Here’s a run-down of what that’s like from the perspective of writing and historical research.
Polymarket - 2024 Presidential Election Predictions
Jul 22, 2024 21:09 • polymarket.com • Polymarket
2024 Presidential Election Predictions. Trust markets, not presidential polls. Live and accurate forecasts by the world's largest prediction market.
A Survey of Prompt Engineering Methods in Large Language Models for Different NLP Tasks
Jul 22, 2024 21:08 • arxiv.org • arXiv.org
Large language models (LLMs) have shown remarkable performance on many different Natural Language Processing (NLP) tasks. Prompt engineering plays a key role in extending the existing abilities of LLMs to achieve significant performance gains on various NLP tasks. Prompt engineering requires composing natural language instructions called prompts to elicit knowledge from LLMs in a structured way. Unlike previous state-of-the-art (SoTA) models, prompt engineering does not require extensive parameter re-training or fine-tuning for the given NLP task and thus operates solely on the embedded knowledge of LLMs. Additionally, LLM enthusiasts can intelligently extract LLMs' knowledge through a basic natural language conversational exchange or prompt engineering, allowing more and more people, even without a deep mathematical machine learning background, to experiment with LLMs. With prompt engineering gaining popularity over the last two years, researchers have come up with numerous techniques for designing prompts that improve the accuracy of information extraction from LLMs. In this paper, we summarize different prompting techniques and group them by the NLP tasks they have been used for. We further highlight, at a granular level, the performance of these prompting strategies on the various datasets belonging to each task, describe the corresponding LLMs used, present a taxonomy diagram, and discuss the possible SoTA for specific datasets. In total, we survey 44 research papers covering 39 different prompting methods on 29 different NLP tasks, most of which have been published in the last two years.
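The core mechanic the survey catalogues is composing an instruction plus in-context examples as plain text, with no weight updates. As a rough illustration (not taken from the paper), a minimal few-shot classification prompt could be assembled like this in Python; the task, the examples, and the `call_llm` stub are assumptions of the sketch, not anything the survey defines:

```python
# Minimal sketch of few-shot prompting for a sentiment-classification task.
# The examples, labels, and the call_llm stub are illustrative assumptions.

FEW_SHOT_EXAMPLES = [
    ("The plot was predictable and the acting flat.", "negative"),
    ("A delightful surprise from start to finish.", "positive"),
]

def build_prompt(query: str) -> str:
    """Compose a natural-language instruction plus in-context examples."""
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for review, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Review: {review}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)

def call_llm(prompt: str) -> str:
    """Stand-in for whatever LLM endpoint is available; replace as needed."""
    raise NotImplementedError

if __name__ == "__main__":
    # Print the assembled prompt; in practice it would be sent to an LLM.
    print(build_prompt("I would happily watch it again."))
```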
Context Embeddings for Efficient Answer Generation in RAG
Jul 22, 2024 21:03 • arxiv.org • arXiv.org
Retrieval-Augmented Generation (RAG) allows overcoming the limited knowledge of LLMs by extending the input with external information. As a consequence, the contextual inputs to the model become much longer, which slows down decoding and directly translates to the time a user has to wait for an answer. We address this challenge by presenting COCOM, an effective context compression method that reduces long contexts to only a handful of Context Embeddings, speeding up generation time by a large margin. Our method allows for different compression rates, trading off decoding time against answer quality. Compared to earlier methods, COCOM handles multiple contexts more effectively, significantly reducing decoding time for long inputs. Our method demonstrates a speed-up of up to 5.69× while achieving higher performance compared to existing efficient context compression methods.
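The abstract only sketches the mechanism: each retrieved passage is squeezed into a small, fixed number of context embeddings, so the decoder reads a few vectors instead of hundreds of tokens. The NumPy snippet below is a toy stand-in for the learned compressor (mean pooling and the tensor shapes are assumptions, not the paper's method) that just makes the length reduction concrete:

```python
# Toy illustration of context compression for RAG: instead of feeding every
# token embedding of a retrieved passage to the decoder, the passage is
# squeezed into a handful of "context embeddings". COCOM learns this
# compression; the mean pooling here is only an illustrative stand-in.

import numpy as np

def compress_context(token_embeddings: np.ndarray, num_ctx_embeddings: int) -> np.ndarray:
    """Reduce (num_tokens, dim) token embeddings to (num_ctx_embeddings, dim)."""
    chunks = np.array_split(token_embeddings, num_ctx_embeddings, axis=0)
    return np.stack([chunk.mean(axis=0) for chunk in chunks])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    passage = rng.normal(size=(512, 768))      # 512 context tokens, dim 768 (assumed)
    compressed = compress_context(passage, 4)  # keep only 4 context embeddings
    print(passage.shape, "->", compressed.shape)
    print("compression rate:", passage.shape[0] // compressed.shape[0], "x")
```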
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models
Jul 22, 2024 21:02 • arxiv.org • arXiv.org
Spreadsheets, with their extensive two-dimensional grids, various layouts, and diverse formatting options, present notable challenges for large language models (LLMs). In response, we introduce SpreadsheetLLM, pioneering an efficient encoding method designed to unleash and optimize LLMs' powerful understanding and reasoning capability on spreadsheets. Initially, we propose a vanilla serialization approach that incorporates cell addresses, values, and formats. However, this approach is limited by LLMs' token constraints, making it impractical for most applications. To tackle this challenge, we develop SheetCompressor, an innovative encoding framework that compresses spreadsheets effectively for LLMs. It comprises three modules: structural-anchor-based compression, inverse index translation, and data-format-aware aggregation. It significantly improves performance on the spreadsheet table detection task, outperforming the vanilla approach by 25.6% in GPT-4's in-context learning setting. Moreover, a fine-tuned LLM with SheetCompressor has an average compression ratio of 25x yet achieves a state-of-the-art 78.9% F1 score, surpassing the best existing models by 12.3%. Finally, we propose Chain of Spreadsheet for downstream tasks of spreadsheet understanding and validate it on a new and demanding spreadsheet QA task. We methodically leverage the inherent layout and structure of spreadsheets, demonstrating that SpreadsheetLLM is highly effective across a variety of spreadsheet tasks.
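To make the encoding contrast concrete, here is a toy sketch of the vanilla address/value/format serialization alongside an inverted-index grouping that collapses repeated cell values, which is the flavor of the inverse index translation module; the record layout and the example sheet are assumptions, not the paper's exact format:

```python
# Sketch of the two encoding ideas named in the abstract, on a toy sheet:
# (1) "vanilla" serialization listing every cell address, value and format;
# (2) an inverted-index style translation that groups cells by identical
#     value so repeated values are not re-serialized.
# The record layout and example data are assumptions for illustration only.

from collections import defaultdict

sheet = {  # toy spreadsheet: address -> (value, format)
    "A1": ("Region", "text"), "B1": ("Sales", "text"),
    "A2": ("North", "text"),  "B2": ("1000", "number"),
    "A3": ("South", "text"),  "B3": ("1000", "number"),
}

def vanilla_serialize(cells: dict) -> str:
    """One record per cell: address, value, format."""
    return "\n".join(f"{addr}|{val}|{fmt}" for addr, (val, fmt) in cells.items())

def inverse_index(cells: dict) -> dict:
    """Group addresses by value, so duplicate values collapse into one entry."""
    index = defaultdict(list)
    for addr, (val, _fmt) in cells.items():
        index[val].append(addr)
    return dict(index)

if __name__ == "__main__":
    print(vanilla_serialize(sheet))
    print(inverse_index(sheet))  # e.g. '1000' -> ['B2', 'B3']
```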
El coste basal del software
Jul 22, 2024 21:00 • www.eferro.net
Translation of the original article Basal Cost of Software. Translation by https://x.com/simonvlc, originally published on his geni...
BeyondPDF - Search with ideas
Jul 22, 2024 20:59 • omkaark.github.io
Simple PDF search for the sophisticated
Managing up: 3 things I wish I realized sooner
Jul 18, 2024 16:37 • read.highgrowthengineer.com • High Growth Engineer
Removing uncertainty, managing your priorities, and showing a growth mindset