Conditionalize use of POSIX features missing on WASI/WebAssembly #92677

Open
wants to merge 1 commit into
base: main

whitequark (Collaborator) commented May 19, 2024

See also the RFC.


This patch makes it possible to build LLVM, Clang, and LLD for WASI/WebAssembly. It introduces conditionals of the form defined(__wasi__) or defined(__wasm__) wherever the code must detect the WASI platform. In addition, it introduces a HAVE_SETJMP feature test macro, because WASI may or may not support setjmp depending on compiler options.
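
As a rough illustration of the platform-conditional style mentioned above (this is not code from the patch), a call that WASI cannot support can be compiled out and replaced with an error return. The helper name `changeOwnerSketch` is hypothetical, and the non-WASI branch assumes a POSIX host that provides `fchown`.

```cpp
#include <cerrno>
#include <system_error>

#if !defined(__wasi__)
#include <unistd.h> // fchown; assumes a POSIX host when not targeting WASI
#endif

// Hypothetical helper, for illustration only.
std::error_code changeOwnerSketch(int FD, unsigned Owner, unsigned Group) {
#if defined(__wasi__)
  // WASI has no ownership model, so report "function not implemented".
  (void)FD;
  (void)Owner;
  (void)Group;
  return std::error_code(ENOSYS, std::generic_category());
#else
  if (::fchown(FD, Owner, Group) < 0)
    return std::error_code(errno, std::generic_category());
  return std::error_code();
#endif
}
```

Callers see an ordinary `std::error_code` on every platform, so the platform check stays confined to the helper.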

@llvmbot added the cmake, clang, clang:driver, and llvm:support labels May 19, 2024
@whitequark requested review from Bigcheese and resistor May 19, 2024 05:08
llvmbot (Member) commented May 19, 2024

@llvm/pr-subscribers-lld-coff
@llvm/pr-subscribers-lld
@llvm/pr-subscribers-lld-elf
@llvm/pr-subscribers-lld-macho
@llvm/pr-subscribers-llvm-adt
@llvm/pr-subscribers-llvm-support

@llvm/pr-subscribers-clang-driver

Author: Catherine (whitequark)

Changes

This patch is the first in a series that makes it possible to build LLVM, Clang, and LLD for WASI/WebAssembly. This patch does not introduce conditionals of the form defined(__wasi__) or defined(__wasm__); instead it detects these APIs like any other platform features. While some features are never present in WASI and the feature check is functionally identical to a platform check, some may be conditionally present if emulation is turned on, e.g. getpid.

The complete list of feature tests introduced is:

  • HAVE_ALARM: WebAssembly does not support delivery of asynchronous signals.
  • HAVE_GETHOSTNAME: WASI does not provide gethostname.
  • HAVE_GETPID: WASI does not have process IDs. However, it ships with a polyfill (-D_WASI_EMULATED_GETPID). The final behavior is the same either way.
  • HAVE_FCHOWN: WASI will likely never support the UNIX ownership model.
  • HAVE_PWD_H: WASI will likely never support UNIX password databases.
  • HAVE_RAISE: WASI does not support delivery of asynchronous signals. However, it ships with a polyfill (-D_WASI_EMULATED_SIGNAL). This polyfill implements raise and signal but not sigaction, so it is not currently useful for LLVM. If sigaction were implemented in wasi-libc, raise(SIGABRT); would be able to invoke the handler.
  • HAVE_SETJMP: WebAssembly implements SjLj using exception handling. Exception handling has not been stabilized in Wasm yet. In addition, it significantly increases deployment complexity. Building with -mllvm -wasm-enable-sjlj enables the use of SjLj.
  • HAVE_SOCKET: WASIp1 does not provide the Berkeley socket API. WASIp2 does provide it. It will likely remain desirable to target WASIp1 for a long time.
  • HAVE_SYS_WAIT_H: WASI does not have subprocess management.
  • HAVE_UMASK: WASI will likely never support the UNIX permission model, but umask might eventually be added in a polyfill.
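
A minimal sketch of how one of the feature tests listed above might be consumed, assuming the macro is either set by the build system or passed by hand (for example `-DHAVE_GETPID=1`). The helper name `processIdSketch` is hypothetical. An undefined macro evaluates to 0 inside `#if`, so the fallback branch is taken unless the feature was detected.

```cpp
#include <cstdint>

// In LLVM this macro would come from the generated config header; in this
// sketch it is assumed to be supplied on the command line, if at all.
#if HAVE_GETPID
#include <unistd.h>
#endif

// Hypothetical helper, for illustration only.
std::int64_t processIdSketch() {
#if HAVE_GETPID
  return static_cast<std::int64_t>(::getpid());
#else
  return 0; // placeholder value when getpid is unavailable
#endif
}
```

Because the check is on the feature rather than the platform, the same code picks up `getpid` automatically when a WASI build enables the emulation polyfill.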

Full diff: https://github.com/llvm/llvm-project/pull/92677.diff

11 Files Affected:

  • (modified) clang/lib/Driver/Driver.cpp (+3-3)
  • (modified) llvm/cmake/config-ix.cmake (+15)
  • (modified) llvm/include/llvm/Config/config.h.cmake (+33)
  • (modified) llvm/lib/ExecutionEngine/Interpreter/ExternalFunctions.cpp (+6)
  • (modified) llvm/lib/Support/CrashRecoveryContext.cpp (+14-1)
  • (modified) llvm/lib/Support/LockFileManager.cpp (+1-1)
  • (modified) llvm/lib/Support/Unix/Path.inc (+30)
  • (modified) llvm/lib/Support/Unix/Process.inc (+4)
  • (modified) llvm/lib/Support/Unix/Unix.h (+4-1)
  • (modified) llvm/lib/Support/Unix/Watchdog.inc (+2-2)
  • (modified) llvm/lib/Support/raw_socket_stream.cpp (+4)
diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp
index 2868b4f2b02e9..f8b7c76e6439d 100644
--- a/clang/lib/Driver/Driver.cpp
+++ b/clang/lib/Driver/Driver.cpp
@@ -99,8 +99,8 @@
 #include <optional>
 #include <set>
 #include <utility>
-#if LLVM_ON_UNIX
-#include <unistd.h> // getpid
+#if HAVE_GETPID
+#include <unistd.h>
 #endif
 
 using namespace clang::driver;
@@ -1577,7 +1577,7 @@ bool Driver::getCrashDiagnosticFile(StringRef ReproCrashFilename,
     CrashDiagDir = "/";
   path::append(CrashDiagDir, "Library/Logs/DiagnosticReports");
   int PID =
-#if LLVM_ON_UNIX
+#if HAVE_GETPID
       getpid();
 #else
       0;
diff --git a/llvm/cmake/config-ix.cmake b/llvm/cmake/config-ix.cmake
index bf1b110245bb2..a1a76bb6241db 100644
--- a/llvm/cmake/config-ix.cmake
+++ b/llvm/cmake/config-ix.cmake
@@ -49,6 +49,7 @@ check_include_file(malloc/malloc.h HAVE_MALLOC_MALLOC_H)
 if( NOT PURE_WINDOWS )
   check_include_file(pthread.h HAVE_PTHREAD_H)
 endif()
+check_include_file(pwd.h HAVE_PWD_H)
 check_include_file(signal.h HAVE_SIGNAL_H)
 check_include_file(sys/ioctl.h HAVE_SYS_IOCTL_H)
 check_include_file(sys/mman.h HAVE_SYS_MMAN_H)
@@ -57,6 +58,7 @@ check_include_file(sys/resource.h HAVE_SYS_RESOURCE_H)
 check_include_file(sys/stat.h HAVE_SYS_STAT_H)
 check_include_file(sys/time.h HAVE_SYS_TIME_H)
 check_include_file(sys/types.h HAVE_SYS_TYPES_H)
+check_include_file(sys/wait.h HAVE_SYS_WAIT_H)
 check_include_file(sysexits.h HAVE_SYSEXITS_H)
 check_include_file(termios.h HAVE_TERMIOS_H)
 check_include_file(unistd.h HAVE_UNISTD_H)
@@ -276,11 +278,15 @@ check_symbol_exists(__deregister_frame "${CMAKE_CURRENT_LIST_DIR}/unwind.h" HAVE
 check_symbol_exists(__unw_add_dynamic_fde "${CMAKE_CURRENT_LIST_DIR}/unwind.h" HAVE_UNW_ADD_DYNAMIC_FDE)
 
 check_symbol_exists(_Unwind_Backtrace "unwind.h" HAVE__UNWIND_BACKTRACE)
+check_symbol_exists(alarm unistd.h HAVE_ALARM)
+check_symbol_exists(gethostname unistd.h HAVE_GETHOSTNAME)
 check_symbol_exists(getpagesize unistd.h HAVE_GETPAGESIZE)
+check_symbol_exists(getpid unistd.h HAVE_GETPID)
 check_symbol_exists(sysconf unistd.h HAVE_SYSCONF)
 check_symbol_exists(getrusage sys/resource.h HAVE_GETRUSAGE)
 check_symbol_exists(setrlimit sys/resource.h HAVE_SETRLIMIT)
 check_symbol_exists(isatty unistd.h HAVE_ISATTY)
+check_symbol_exists(fchown unistd.h HAVE_FCHOWN)
 check_symbol_exists(futimens sys/stat.h HAVE_FUTIMENS)
 check_symbol_exists(futimes sys/time.h HAVE_FUTIMES)
 # AddressSanitizer conflicts with lib/Support/Unix/Signals.inc
@@ -298,10 +304,14 @@ check_symbol_exists(malloc_zone_statistics malloc/malloc.h
 check_symbol_exists(getrlimit "sys/types.h;sys/time.h;sys/resource.h" HAVE_GETRLIMIT)
 check_symbol_exists(posix_spawn spawn.h HAVE_POSIX_SPAWN)
 check_symbol_exists(pread unistd.h HAVE_PREAD)
+check_symbol_exists(raise signal.h HAVE_RAISE)
 check_symbol_exists(sbrk unistd.h HAVE_SBRK)
+check_symbol_exists(setjmp setjmp.h HAVE_SETJMP)
+check_symbol_exists(socket sys/socket.h HAVE_SOCKET)
 check_symbol_exists(strerror_r string.h HAVE_STRERROR_R)
 check_symbol_exists(strerror_s string.h HAVE_DECL_STRERROR_S)
 check_symbol_exists(setenv stdlib.h HAVE_SETENV)
+check_symbol_exists(umask sys/stat.h HAVE_UMASK)
 if( PURE_WINDOWS )
   check_symbol_exists(_chsize_s io.h HAVE__CHSIZE_S)
 
@@ -327,6 +337,11 @@ if( PURE_WINDOWS )
   check_function_exists(__cmpdi2 HAVE___CMPDI2)
 endif()
 
+check_c_source_compiles("
+        struct flock lk;
+        int main(void) { return 0; }"
+        HAVE_FLOCK)
+
 CHECK_STRUCT_HAS_MEMBER("struct stat" st_mtimespec.tv_nsec
     "sys/types.h;sys/stat.h" HAVE_STRUCT_STAT_ST_MTIMESPEC_TV_NSEC)
 if (UNIX AND ${CMAKE_SYSTEM_NAME} MATCHES "AIX")
diff --git a/llvm/include/llvm/Config/config.h.cmake b/llvm/include/llvm/Config/config.h.cmake
index 977c182e9d2b0..8cbefffb2121d 100644
--- a/llvm/include/llvm/Config/config.h.cmake
+++ b/llvm/include/llvm/Config/config.h.cmake
@@ -23,6 +23,9 @@
    backslashes. */
 #cmakedefine01 LLVM_WINDOWS_PREFER_FORWARD_SLASH
 
+/* Define to 1 if you have the `alarm' function. */
+#cmakedefine HAVE_ALARM ${HAVE_ALARM}
+
 /* Define to 1 if you have the `backtrace' function. */
 #cmakedefine HAVE_BACKTRACE ${HAVE_BACKTRACE}
 
@@ -71,6 +74,12 @@
 /* Define to 1 if you have the <errno.h> header file. */
 #cmakedefine HAVE_ERRNO_H ${HAVE_ERRNO_H}
 
+/* Define to 1 if you have the `flock' structure. */
+#cmakedefine HAVE_FLOCK ${HAVE_FLOCK}
+
+/* Define to 1 if you have the `fchown' function. */
+#cmakedefine HAVE_FCHOWN ${HAVE_FCHOWN}
+
 /* Define to 1 if you have the <fcntl.h> header file. */
 #cmakedefine HAVE_FCNTL_H ${HAVE_FCNTL_H}
 
@@ -92,9 +101,15 @@
 /* Define to 1 if you have the `futimes' function. */
 #cmakedefine HAVE_FUTIMES ${HAVE_FUTIMES}
 
+/* Define to 1 if you have the `gethostname' function. */
+#cmakedefine HAVE_GETHOSTNAME ${HAVE_GETHOSTNAME}
+
 /* Define to 1 if you have the `getpagesize' function. */
 #cmakedefine HAVE_GETPAGESIZE ${HAVE_GETPAGESIZE}
 
+/* Define to 1 if you have the `getpid' function. */
+#cmakedefine HAVE_GETPID ${HAVE_GETPID}
+
 /* Define to 1 if you have the `getrlimit' function. */
 #cmakedefine HAVE_GETRLIMIT ${HAVE_GETRLIMIT}
 
@@ -161,9 +176,21 @@
 /* Have pthread_rwlock_init */
 #cmakedefine HAVE_PTHREAD_RWLOCK_INIT ${HAVE_PTHREAD_RWLOCK_INIT}
 
+/* Define to 1 if you have the <pwd.h> header file. */
+#cmakedefine HAVE_PWD_H ${HAVE_PWD_H}
+
+/* Define to 1 if you have the `raise' function. */
+#cmakedefine HAVE_RAISE ${HAVE_RAISE}
+
 /* Define to 1 if you have the `sbrk' function. */
 #cmakedefine HAVE_SBRK ${HAVE_SBRK}
 
+/* Define to 1 if you have the `setjmp' function. */
+#cmakedefine HAVE_SETJMP ${HAVE_SETJMP}
+
+/* Define to 1 if you have the `socket' function. */
+#cmakedefine HAVE_SOCKET ${HAVE_SOCKET}
+
 /* Define to 1 if you have the `setenv' function. */
 #cmakedefine HAVE_SETENV ${HAVE_SETENV}
 
@@ -209,12 +236,18 @@
 /* Define to 1 if you have the <sys/types.h> header file. */
 #cmakedefine HAVE_SYS_TYPES_H ${HAVE_SYS_TYPES_H}
 
+/* Define to 1 if you have the <sys/wait.h> header file. */
+#cmakedefine HAVE_SYS_WAIT_H ${HAVE_SYS_WAIT_H}
+
 /* Define if the setupterm() function is supported this platform. */
 #cmakedefine LLVM_ENABLE_TERMINFO ${LLVM_ENABLE_TERMINFO}
 
 /* Define to 1 if you have the <termios.h> header file. */
 #cmakedefine HAVE_TERMIOS_H ${HAVE_TERMIOS_H}
 
+/* Define to 1 if you have the `umask' function. */
+#cmakedefine HAVE_UMASK ${HAVE_UMASK}
+
 /* Define to 1 if you have the <unistd.h> header file. */
 #cmakedefine HAVE_UNISTD_H ${HAVE_UNISTD_H}
 
diff --git a/llvm/lib/ExecutionEngine/Interpreter/ExternalFunctions.cpp b/llvm/lib/ExecutionEngine/Interpreter/ExternalFunctions.cpp
index 4f8f883a75f32..345a80e70f241 100644
--- a/llvm/lib/ExecutionEngine/Interpreter/ExternalFunctions.cpp
+++ b/llvm/lib/ExecutionEngine/Interpreter/ExternalFunctions.cpp
@@ -34,7 +34,9 @@
 #include "llvm/Support/raw_ostream.h"
 #include <cassert>
 #include <cmath>
+#if HAVE_RAISE
 #include <csignal>
+#endif
 #include <cstdint>
 #include <cstdio>
 #include <cstring>
@@ -340,7 +342,11 @@ static GenericValue lle_X_exit(FunctionType *FT, ArrayRef<GenericValue> Args) {
 static GenericValue lle_X_abort(FunctionType *FT, ArrayRef<GenericValue> Args) {
   //FIXME: should we report or raise here?
   //report_fatal_error("Interpreted program raised SIGABRT");
+#if HAVE_RAISE
   raise (SIGABRT);
+#else
+  abort();
+#endif
   return GenericValue();
 }
 
diff --git a/llvm/lib/Support/CrashRecoveryContext.cpp b/llvm/lib/Support/CrashRecoveryContext.cpp
index f53aea177d612..d25590d4ff3a6 100644
--- a/llvm/lib/Support/CrashRecoveryContext.cpp
+++ b/llvm/lib/Support/CrashRecoveryContext.cpp
@@ -13,8 +13,13 @@
 #include "llvm/Support/Signals.h"
 #include "llvm/Support/thread.h"
 #include <cassert>
+#if HAVE_RAISE
+#include <csignal>
+#endif
 #include <mutex>
+#if HAVE_SETJMP
 #include <setjmp.h>
+#endif
 
 using namespace llvm;
 
@@ -31,7 +36,9 @@ struct CrashRecoveryContextImpl {
   const CrashRecoveryContextImpl *Next;
 
   CrashRecoveryContext *CRC;
+#ifdef HAVE_SETJMP
   ::jmp_buf JumpBuffer;
+#endif
   volatile unsigned Failed : 1;
   unsigned SwitchedThread : 1;
   unsigned ValidJumpBuffer : 1;
@@ -72,9 +79,11 @@ struct CrashRecoveryContextImpl {
 
     CRC->RetCode = RetCode;
 
+#if HAVE_SETJMP
     // Jump back to the RunSafely we were called under.
     if (ValidJumpBuffer)
       longjmp(JumpBuffer, 1);
+#endif
 
     // Otherwise let the caller decide of the outcome of the crash. Currently
     // this occurs when using SEH on Windows with MSVC or clang-cl.
@@ -417,10 +426,12 @@ bool CrashRecoveryContext::RunSafely(function_ref<void()> Fn) {
     CrashRecoveryContextImpl *CRCI = new CrashRecoveryContextImpl(this);
     Impl = CRCI;
 
+#if HAVE_SETJMP
     CRCI->ValidJumpBuffer = true;
     if (setjmp(CRCI->JumpBuffer) != 0) {
       return false;
     }
+#endif
   }
 
   Fn();
@@ -469,9 +480,11 @@ bool CrashRecoveryContext::throwIfCrash(int RetCode) {
     return false;
 #if defined(_WIN32)
   ::RaiseException(RetCode, 0, 0, NULL);
-#else
+#elif HAVE_RAISE
   llvm::sys::unregisterHandlers();
   raise(RetCode - 128);
+#else
+  abort();
 #endif
   return true;
 }
diff --git a/llvm/lib/Support/LockFileManager.cpp b/llvm/lib/Support/LockFileManager.cpp
index 083f8d7b37be3..edfa8825b3337 100644
--- a/llvm/lib/Support/LockFileManager.cpp
+++ b/llvm/lib/Support/LockFileManager.cpp
@@ -94,7 +94,7 @@ static std::error_code getHostID(SmallVectorImpl<char> &HostID) {
   StringRef UUIDRef(UUIDStr);
   HostID.append(UUIDRef.begin(), UUIDRef.end());
 
-#elif LLVM_ON_UNIX
+#elif HAVE_GETHOSTNAME
   char HostName[256];
   HostName[255] = 0;
   HostName[0] = 0;
diff --git a/llvm/lib/Support/Unix/Path.inc b/llvm/lib/Support/Unix/Path.inc
index 6e679f74869f0..250e946781446 100644
--- a/llvm/lib/Support/Unix/Path.inc
+++ b/llvm/lib/Support/Unix/Path.inc
@@ -32,7 +32,9 @@
 #endif
 
 #include <dirent.h>
+#ifdef HAVE_PWD_H
 #include <pwd.h>
+#endif
 #include <sys/file.h>
 
 #ifdef __APPLE__
@@ -673,6 +675,11 @@ static void expandTildeExpr(SmallVectorImpl<char> &Path) {
     return;
   }
 
+#if !defined(HAVE_PWD_H)
+  // No access to password database, return back the original path.
+  (void)Remainder;
+  return;
+#else
   // This is a string of the form ~username/, look up this user's entry in the
   // password database.
   std::unique_ptr<char[]> Buf;
@@ -694,6 +701,7 @@ static void expandTildeExpr(SmallVectorImpl<char> &Path) {
   Path.clear();
   Path.append(Entry->pw_dir, Entry->pw_dir + strlen(Entry->pw_dir));
   llvm::sys::path::append(Path, Storage);
+#endif
 }
 
 void expand_tilde(const Twine &path, SmallVectorImpl<char> &dest) {
@@ -770,11 +778,15 @@ std::error_code status(int FD, file_status &Result) {
 }
 
 unsigned getUmask() {
+#if HAVE_UMASK
   // Chose arbitary new mask and reset the umask to the old mask.
   // umask(2) never fails so ignore the return of the second call.
   unsigned Mask = ::umask(0);
   (void)::umask(Mask);
   return Mask;
+#else
+  return 0022;
+#endif
 }
 
 std::error_code setPermissions(const Twine &Path, perms Permissions) {
@@ -1224,6 +1236,7 @@ Expected<size_t> readNativeFileSlice(file_t FD, MutableArrayRef<char> Buf,
 }
 
 std::error_code tryLockFile(int FD, std::chrono::milliseconds Timeout) {
+#if HAVE_FLOCK
   auto Start = std::chrono::steady_clock::now();
   auto End = Start + Timeout;
   do {
@@ -1241,9 +1254,13 @@ std::error_code tryLockFile(int FD, std::chrono::milliseconds Timeout) {
     usleep(1000);
   } while (std::chrono::steady_clock::now() < End);
   return make_error_code(errc::no_lock_available);
+#else
+  return std::error_code(ENOSYS, std::generic_category());
+#endif
 }
 
 std::error_code lockFile(int FD) {
+#if HAVE_FLOCK
   struct flock Lock;
   memset(&Lock, 0, sizeof(Lock));
   Lock.l_type = F_WRLCK;
@@ -1253,9 +1270,13 @@ std::error_code lockFile(int FD) {
   if (::fcntl(FD, F_SETLKW, &Lock) != -1)
     return std::error_code();
   return errnoAsErrorCode();
+#else
+  return std::error_code(ENOSYS, std::generic_category());
+#endif
 }
 
 std::error_code unlockFile(int FD) {
+#if HAVE_FLOCK
   struct flock Lock;
   Lock.l_type = F_UNLCK;
   Lock.l_whence = SEEK_SET;
@@ -1264,6 +1285,9 @@ std::error_code unlockFile(int FD) {
   if (::fcntl(FD, F_SETLK, &Lock) != -1)
     return std::error_code();
   return errnoAsErrorCode();
+#else
+  return std::error_code(ENOSYS, std::generic_category());
+#endif
 }
 
 std::error_code closeFile(file_t &F) {
@@ -1335,11 +1359,15 @@ std::error_code real_path(const Twine &path, SmallVectorImpl<char> &dest,
 }
 
 std::error_code changeFileOwnership(int FD, uint32_t Owner, uint32_t Group) {
+#if HAVE_FCHOWN
   auto FChown = [&]() { return ::fchown(FD, Owner, Group); };
   // Retry if fchown call fails due to interruption.
   if ((sys::RetryAfterSignal(-1, FChown)) < 0)
     return errnoAsErrorCode();
   return std::error_code();
+#else
+  return std::error_code(ENOSYS, std::generic_category());
+#endif
 }
 
 } // end namespace fs
@@ -1349,6 +1377,7 @@ namespace path {
 bool home_directory(SmallVectorImpl<char> &result) {
   std::unique_ptr<char[]> Buf;
   char *RequestedDir = getenv("HOME");
+#if HAVE_PWD_H
   if (!RequestedDir) {
     long BufSize = sysconf(_SC_GETPW_R_SIZE_MAX);
     if (BufSize <= 0)
@@ -1360,6 +1389,7 @@ bool home_directory(SmallVectorImpl<char> &result) {
     if (pw && pw->pw_dir)
       RequestedDir = pw->pw_dir;
   }
+#endif
   if (!RequestedDir)
     return false;
 
diff --git a/llvm/lib/Support/Unix/Process.inc b/llvm/lib/Support/Unix/Process.inc
index ae90924cae1b9..02f984820590a 100644
--- a/llvm/lib/Support/Unix/Process.inc
+++ b/llvm/lib/Support/Unix/Process.inc
@@ -72,7 +72,11 @@ getRUsageTimes() {
 Process::Pid Process::getProcessId() {
   static_assert(sizeof(Pid) >= sizeof(pid_t),
                 "Process::Pid should be big enough to store pid_t");
+#if HAVE_GETPID
   return Pid(::getpid());
+#else
+  return Pid(0);
+#endif
 }
 
 // On Cygwin, getpagesize() returns 64k(AllocationGranularity) and
diff --git a/llvm/lib/Support/Unix/Unix.h b/llvm/lib/Support/Unix/Unix.h
index 1599241a344af..b6988b34832b3 100644
--- a/llvm/lib/Support/Unix/Unix.h
+++ b/llvm/lib/Support/Unix/Unix.h
@@ -30,7 +30,10 @@
 #include <cstring>
 #include <string>
 #include <sys/types.h>
-#include <sys/wait.h>
+
+#ifdef HAVE_SYS_WAIT_H
+# include <sys/wait.h>
+#endif
 
 #ifdef HAVE_UNISTD_H
 #include <unistd.h>
diff --git a/llvm/lib/Support/Unix/Watchdog.inc b/llvm/lib/Support/Unix/Watchdog.inc
index b33e52d88500d..dd65028769d37 100644
--- a/llvm/lib/Support/Unix/Watchdog.inc
+++ b/llvm/lib/Support/Unix/Watchdog.inc
@@ -19,13 +19,13 @@
 namespace llvm {
 namespace sys {
 Watchdog::Watchdog(unsigned int seconds) {
-#ifdef HAVE_UNISTD_H
+#ifdef HAVE_ALARM
   alarm(seconds);
 #endif
 }
 
 Watchdog::~Watchdog() {
-#ifdef HAVE_UNISTD_H
+#ifdef HAVE_ALARM
   alarm(0);
 #endif
 }
diff --git a/llvm/lib/Support/raw_socket_stream.cpp b/llvm/lib/Support/raw_socket_stream.cpp
index 14e2308df4d7e..ae3fba3593d8e 100644
--- a/llvm/lib/Support/raw_socket_stream.cpp
+++ b/llvm/lib/Support/raw_socket_stream.cpp
@@ -41,6 +41,8 @@
 
 using namespace llvm;
 
+#if defined(_WIN32) || HAVE_SOCKET
+
 #ifdef _WIN32
 WSABalancer::WSABalancer() {
   WSADATA WsaData;
@@ -304,3 +306,5 @@ raw_socket_stream::createConnectedUnix(StringRef SocketPath) {
 }
 
 raw_socket_stream::~raw_socket_stream() {}
+
+#endif
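
The HAVE_SETJMP guards in CrashRecoveryContext.cpp above follow a pattern that can be sketched in isolation. This is a simplified, single-threaded illustration, not the LLVM implementation: the names in the `sketch` namespace are hypothetical, and HAVE_SETJMP is assumed to default to 1 when the build system does not set it.

```cpp
#include <cstdlib>

#ifndef HAVE_SETJMP
#define HAVE_SETJMP 1 // assumption for this sketch; normally set by the build
#endif

#if HAVE_SETJMP
#include <csetjmp>
#endif

namespace sketch {

#if HAVE_SETJMP
std::jmp_buf RecoveryPoint;
bool RecoveryPointValid = false;
#endif

// Called by protected code when it hits a fatal error.
[[noreturn]] void handleCrash() {
#if HAVE_SETJMP
  // Jump back to the runSafely call we are running under, if any.
  if (RecoveryPointValid)
    std::longjmp(RecoveryPoint, 1);
#endif
  // Without setjmp/longjmp there is nothing to unwind to.
  std::abort();
}

// Runs Fn and returns false if it reported a crash through handleCrash.
bool runSafely(void (*Fn)()) {
#if HAVE_SETJMP
  RecoveryPointValid = true;
  if (setjmp(RecoveryPoint) != 0) {
    RecoveryPointValid = false;
    return false;
  }
#endif
  Fn();
#if HAVE_SETJMP
  RecoveryPointValid = false;
#endif
  return true;
}

} // namespace sketch
```

With HAVE_SETJMP set to 0 the function still runs the callback, but a crash terminates the process instead of being recovered, which matches the degraded behavior the guards in the diff accept on targets without SjLj support.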

llvmbot (Member) commented May 19, 2024

@llvm/pr-subscribers-clang

+#endif
 }
 
 std::error_code lockFile(int FD) {
+#if HAVE_FLOCK
   struct flock Lock;
   memset(&Lock, 0, sizeof(Lock));
   Lock.l_type = F_WRLCK;
@@ -1253,9 +1270,13 @@ std::error_code lockFile(int FD) {
   if (::fcntl(FD, F_SETLKW, &Lock) != -1)
     return std::error_code();
   return errnoAsErrorCode();
+#else
+  return std::error_code(ENOSYS, std::generic_category());
+#endif
 }
 
 std::error_code unlockFile(int FD) {
+#if HAVE_FLOCK
   struct flock Lock;
   Lock.l_type = F_UNLCK;
   Lock.l_whence = SEEK_SET;
@@ -1264,6 +1285,9 @@ std::error_code unlockFile(int FD) {
   if (::fcntl(FD, F_SETLK, &Lock) != -1)
     return std::error_code();
   return errnoAsErrorCode();
+#else
+  return std::error_code(ENOSYS, std::generic_category());
+#endif
 }
 
 std::error_code closeFile(file_t &F) {
@@ -1335,11 +1359,15 @@ std::error_code real_path(const Twine &path, SmallVectorImpl<char> &dest,
 }
 
 std::error_code changeFileOwnership(int FD, uint32_t Owner, uint32_t Group) {
+#if HAVE_FCHOWN
   auto FChown = [&]() { return ::fchown(FD, Owner, Group); };
   // Retry if fchown call fails due to interruption.
   if ((sys::RetryAfterSignal(-1, FChown)) < 0)
     return errnoAsErrorCode();
   return std::error_code();
+#else
+  return std::error_code(ENOSYS, std::generic_category());
+#endif
 }
 
 } // end namespace fs
@@ -1349,6 +1377,7 @@ namespace path {
 bool home_directory(SmallVectorImpl<char> &result) {
   std::unique_ptr<char[]> Buf;
   char *RequestedDir = getenv("HOME");
+#if HAVE_PWD_H
   if (!RequestedDir) {
     long BufSize = sysconf(_SC_GETPW_R_SIZE_MAX);
     if (BufSize <= 0)
@@ -1360,6 +1389,7 @@ bool home_directory(SmallVectorImpl<char> &result) {
     if (pw && pw->pw_dir)
       RequestedDir = pw->pw_dir;
   }
+#endif
   if (!RequestedDir)
     return false;
 
diff --git a/llvm/lib/Support/Unix/Process.inc b/llvm/lib/Support/Unix/Process.inc
index ae90924cae1b9..02f984820590a 100644
--- a/llvm/lib/Support/Unix/Process.inc
+++ b/llvm/lib/Support/Unix/Process.inc
@@ -72,7 +72,11 @@ getRUsageTimes() {
 Process::Pid Process::getProcessId() {
   static_assert(sizeof(Pid) >= sizeof(pid_t),
                 "Process::Pid should be big enough to store pid_t");
+#if HAVE_GETPID
   return Pid(::getpid());
+#else
+  return Pid(0);
+#endif
 }
 
 // On Cygwin, getpagesize() returns 64k(AllocationGranularity) and
diff --git a/llvm/lib/Support/Unix/Unix.h b/llvm/lib/Support/Unix/Unix.h
index 1599241a344af..b6988b34832b3 100644
--- a/llvm/lib/Support/Unix/Unix.h
+++ b/llvm/lib/Support/Unix/Unix.h
@@ -30,7 +30,10 @@
 #include <cstring>
 #include <string>
 #include <sys/types.h>
-#include <sys/wait.h>
+
+#ifdef HAVE_SYS_WAIT_H
+# include <sys/wait.h>
+#endif
 
 #ifdef HAVE_UNISTD_H
 #include <unistd.h>
diff --git a/llvm/lib/Support/Unix/Watchdog.inc b/llvm/lib/Support/Unix/Watchdog.inc
index b33e52d88500d..dd65028769d37 100644
--- a/llvm/lib/Support/Unix/Watchdog.inc
+++ b/llvm/lib/Support/Unix/Watchdog.inc
@@ -19,13 +19,13 @@
 namespace llvm {
 namespace sys {
 Watchdog::Watchdog(unsigned int seconds) {
-#ifdef HAVE_UNISTD_H
+#ifdef HAVE_ALARM
   alarm(seconds);
 #endif
 }
 
 Watchdog::~Watchdog() {
-#ifdef HAVE_UNISTD_H
+#ifdef HAVE_ALARM
   alarm(0);
 #endif
 }
diff --git a/llvm/lib/Support/raw_socket_stream.cpp b/llvm/lib/Support/raw_socket_stream.cpp
index 14e2308df4d7e..ae3fba3593d8e 100644
--- a/llvm/lib/Support/raw_socket_stream.cpp
+++ b/llvm/lib/Support/raw_socket_stream.cpp
@@ -41,6 +41,8 @@
 
 using namespace llvm;
 
+#if defined(_WIN32) || HAVE_SOCKET
+
 #ifdef _WIN32
 WSABalancer::WSABalancer() {
   WSADATA WsaData;
@@ -304,3 +306,5 @@ raw_socket_stream::createConnectedUnix(StringRef SocketPath) {
 }
 
 raw_socket_stream::~raw_socket_stream() {}
+
+#endif

github-actions bot commented May 19, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@whitequark whitequark force-pushed the main+wasm+pr branch 2 times, most recently from 4110cfe to 0a86da8 Compare May 19, 2024 05:37
@whitequark whitequark requested a review from AaronBallman May 20, 2024 17:02
@whitequark
Collaborator Author

I can see the build failure--I think it's related to the changes to crash reporting, but I'm not sure how yet. I'll fix it but I'd still very much like the feedback on the general approach.

Collaborator

@AaronBallman AaronBallman left a comment

Sorry for not responding sooner, but I think this approach makes sense to me. It's basically similar to autoconf where we're checking whether a feature is supported and guarding against it.

One concern I have is that someone adding new code may think they need to come up with fallback behavior in case a feature isn't available (e.g., "how do I do this if I don't have setjmp?") and it would be nice if we could find a way to make it clear that they are not responsible for coming up with that fallback behavior. (Mostly worried about folks doing a bunch of heavy lifting before submitting a PR only to hear during review "you didn't need to do that, please pull it out because we don't have a way to test it".)

@whitequark
Collaborator Author

> One concern I have is that someone adding new code may think they need to come up with fallback behavior in case a feature isn't available

Shall I add some comments to config.h to that effect?

Collaborator

Comments in the config file aren't a harmful thing, but I'm also worried about folks who never see the config file. e.g., they're working in Path.inc on something, see HAVE_PWD_H in the code, and just make assumptions from there.

So perhaps a two-pronged approach: a comment in config.h for the cases where we don't expect a graceful fallback, and a similar comment around the guarded #include in source for the folks who won't look in config.h?
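
A minimal sketch of what the in-source half of that two-pronged approach could look like. The macro `DEMO_HAVE_FCHOWN` and the function `changeOwnership` are hypothetical stand-ins (the macro is hard-coded here rather than produced by CMake), and the wording of the comment is only an assumption about what such a note might say.

```cpp
#include <cassert>
#include <cerrno>
#include <system_error>

// Hypothetical stand-in for a macro that config.h would normally provide;
// hard-coded to 0 here so that the fallback path is the one compiled.
#define DEMO_HAVE_FCHOWN 0

#if DEMO_HAVE_FCHOWN
#include <unistd.h>
#endif

// NOTE for contributors: platforms without this feature only get the ENOSYS
// stub below. New code is not expected to emulate the missing feature.
std::error_code changeOwnership(int FD, unsigned Owner, unsigned Group) {
#if DEMO_HAVE_FCHOWN
  if (::fchown(FD, Owner, Group) < 0)
    return std::error_code(errno, std::generic_category());
  return std::error_code();
#else
  (void)FD;
  (void)Owner;
  (void)Group;
  return std::error_code(ENOSYS, std::generic_category());
#endif
}
```

The stub mirrors the `ENOSYS` returns in the Path.inc hunks of the diff; the comment above the function is the part aimed at readers who never open config.h.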

@whitequark
Collaborator Author

> Comments in the config file aren't a harmful thing, but I'm also worried about folks who never see the config file. e.g., they're working in Path.inc on something, see HAVE_PWD_H in the code, and just make assumptions from there.
>
> So perhaps a two-pronged approach: a comment in config.h for the cases where we don't expect a graceful fallback, and a similar comment around the guarded #include in source for the folks who won't look in config.h?

I could do that. What do you think of the fairly strongly worded objection in the RFC thread that requested I use defined(__wasi__) instead of individual conditionals?

Collaborator

Ah thank you for pointing that out, I had missed some updates on that thread.

I think defined(__wasi__) would address my concerns. @MaskRay asked a good question about whether we should tie this to LLVM_ON_UNIX as well and I don't have a strong intuition there, but my naive thinking is that it would help in some cases but be insufficient in others (e.g., within Support/Unix/Path.inc, we'd still need to have a conditional for WASI).

@whitequark
Collaborator Author

> but my naive thinking is that it would help in some cases but be insufficient in others (e.g., within Support/Unix/Path.inc, we'd still need to have a conditional for WASI).

LLVM_ON_UNIX is definitely a mixed bag for WASI and I'll take another look at whether it should be defined or not when I rework the patch to use defined(__wasi__) instead.

@whitequark whitequark force-pushed the main+wasm+pr branch 2 times, most recently from 66199b7 to 0f15e9b Compare July 13, 2024 10:28
@whitequark
Collaborator Author

I've updated the PR to use almost exclusively defined(__wasi__)/defined(__wasm__), with the exception of HAVE_SETJMP, where you could conceivably have both kinds of builds on WebAssembly depending on your needs.
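
To illustrate why `HAVE_SETJMP` stays a feature test, here is a rough sketch of a recovery helper that works in both kinds of build. `DEMO_HAVE_SETJMP`, `runSafely`, `handleCrash`, and the two task functions are hypothetical names, and the logic is a simplified model of the `CrashRecoveryContext` hunks in the diff, not the actual LLVM implementation.

```cpp
#include <cassert>

// Hypothetical stand-in for HAVE_SETJMP from config.h; set to 1 here so the
// recovery path is compiled in. With 0, runSafely just calls Fn.
#define DEMO_HAVE_SETJMP 1

#if DEMO_HAVE_SETJMP
#include <setjmp.h>
static jmp_buf JumpBuffer;
static bool ValidJumpBuffer = false;
#endif

// Reports a simulated crash. With setjmp support it jumps back into
// runSafely; without it the call simply returns to its caller.
void handleCrash() {
#if DEMO_HAVE_SETJMP
  if (ValidJumpBuffer)
    longjmp(JumpBuffer, 1);
#endif
}

// Runs Fn and returns false if Fn reported a crash through handleCrash().
bool runSafely(void (*Fn)()) {
#if DEMO_HAVE_SETJMP
  ValidJumpBuffer = true;
  if (setjmp(JumpBuffer) != 0) {
    ValidJumpBuffer = false;
    return false;
  }
#endif
  Fn();
#if DEMO_HAVE_SETJMP
  ValidJumpBuffer = false;
#endif
  return true;
}

void crashingTask() { handleCrash(); }
void normalTask() {}
```

With the macro set to 0 the same source still compiles, and `runSafely` degrades to a plain call, which is the behavior the guarded hunks produce on builds without setjmp support.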

@whitequark whitequark force-pushed the main+wasm+pr branch 2 times, most recently from 1112971 to b64ed26 Compare July 13, 2024 11:18
@whitequark
Collaborator Author

Failed test is:


******************** TEST 'LLVM-Unit :: Support/./SupportTests.exe/41/43' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:C:\ws\src\build\unittests\Support\.\SupportTests.exe-LLVM-Unit-43852-41-43.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=43 GTEST_SHARD_INDEX=41 C:\ws\src\build\unittests\Support\.\SupportTests.exe
--
Script:
--
C:\ws\src\build\unittests\Support\.\SupportTests.exe --gtest_filter=LockFileManagerTest.Basic
--
C:\ws\src\llvm\unittests\Support\LockFileManagerTest.cpp(35): error: Expected: (LockFileManager::LFS_Owned) != (Locked2.getState()), actual: 0 vs 0
C:\ws\src\llvm\unittests\Support\LockFileManagerTest.cpp:35
Expected: (LockFileManager::LFS_Owned) != (Locked2.getState()), actual: 0 vs 0
********************

This doesn't seem to make any sense since expected == actual here?
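
One possible reading of the log, offered as an assumption rather than a diagnosis: a not-equal check fails exactly when both sides compare equal, so "actual: 0 vs 0" would mean the second lock manager also reported `LFS_Owned`. The helper below is hypothetical and only mimics the shape of the message; it is not GoogleTest code.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper that imitates the failure text in the log above.
// Returns an empty string when the two values differ (the check passes).
std::string checkNotEqual(const char *LhsExpr, int Lhs, const char *RhsExpr,
                          int Rhs) {
  if (Lhs != Rhs)
    return "";
  std::ostringstream OS;
  OS << "Expected: (" << LhsExpr << ") != (" << RhsExpr << "), actual: " << Lhs
     << " vs " << Rhs;
  return OS.str();
}
```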

@whitequark
Collaborator Author

whitequark commented Jul 13, 2024

Now that there are instructions for trying the Wasm-hosted build, and I've verified that it works on relatively simple examples (such as <stdio.h> hello world as well as <iostream> hello world), I think this PR can progress forward. Once it is merged I will set up the nightly builds proper; I have the infrastructure for it set up in the YoWASP/clang repo but it can't be enabled until this PR is merged.

I can see the buildbot failures; I'm not yet sure what's up with them, any pointers are appreciated.

@whitequark
Collaborator Author

I've further confirmed that clang.wasm can build complex software by building Yosys for Wasm using clang.wasm. It works fine and is able to translate a simple design; this makes it clear that a mature codebase with a mixture of C and C++ code can compile, link, and produce a working artifact using this port, so I have a high degree of confidence in this PR now.

This patch makes it possible to build LLVM, Clang, and LLD for
WASI/WebAssembly. This patch introduces conditionals of the form
`defined(__wasi__)` or `defined(__wasm__)` wherever necessary to detect
the use of the WASI platform. In addition, it introduces a `HAVE_SETJMP`
feature test macro because the WASI platform can have or lack support
for this feature depending on compiler options.
Member

@jyknight jyknight left a comment

Overall, I think this would be improved by splitting off a bunch of the lib/Support/Unix/*.inc files into separate lib/Support/WASI/*.inc files. Some of these files do seem to make sense to share, e.g. Path.inc, but it seems to me that most of the rest are almost 100% stubbed out for WASI, and it would be clearer to have those no-op stubs in a separate WASI dir.

@@ -45,8 +45,11 @@ void lld::unlinkAsync(StringRef path) {
if (!sys::fs::exists(path) || !sys::fs::is_regular_file(path))
return;

// If threads are disabled, remove the file synchronously.

Member

There's already a check below,

  if (parallel::strategy.ThreadsRequested == 1)
    return;

Probably that should be a constant in no-threads builds, then maybe this change isn't needed?
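
A sketch of one way to read this suggestion, under the assumption that the thread count is exposed through an accessor instead of changing the field itself. `DEMO_ENABLE_THREADS`, `DemoStrategy`, `threadsRequested`, and `shouldUnlinkSynchronously` are all hypothetical names.

```cpp
#include <cassert>

// Hypothetical stand-in for LLVM_ENABLE_THREADS; 0 models a no-threads build.
#define DEMO_ENABLE_THREADS 0

// Hypothetical stand-in for the strategy object holding the requested count.
struct DemoStrategy {
  unsigned ThreadsRequested = 0;
};

// Number of threads callers should plan for. In a no-threads build this
// always returns 1, regardless of what the strategy object says.
inline unsigned threadsRequested(const DemoStrategy &S) {
#if DEMO_ENABLE_THREADS
  return S.ThreadsRequested;
#else
  (void)S;
  return 1;
#endif
}

// Mirrors the early return quoted above: remove the file synchronously when
// only one thread is available.
inline bool shouldUnlinkSynchronously(const DemoStrategy &S) {
  return threadsRequested(S) == 1;
}
```

With an accessor like this, the existing `== 1` check would cover the no-threads case without an extra preprocessor guard at the call site.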

Collaborator Author

I'm sorry, I don't quite understand. Are you asking me to change ThreadPoolStrategy::ThreadsRequested to be a constant? I could do that but it seems like an invasive change. Looking at other uses of ThreadsRequested, it looks like many of them are guarded in essentially the same way as what this patch introduces, e.g. here:

#if LLVM_ENABLE_THREADS
: Parallel((parallel::strategy.ThreadsRequested != 1) &&
(threadIndex == UINT_MAX)) {}
#else
: Parallel(false) {}
#endif

or here:

void llvm::parallelFor(size_t Begin, size_t End,
llvm::function_ref<void(size_t)> Fn) {
#if LLVM_ENABLE_THREADS
if (parallel::strategy.ThreadsRequested != 1) {

#if !defined(__wasi__)
#include <csignal>
#endif
#if LLVM_ENABLE_THREADS

Member

This is unfortunate; there's no reason (in theory) that mutex shouldn't work fine in single-threaded mode.

Collaborator Author

I think it actually does work, since I ended up building LLVM always with the wasm32-wasip1-threads target, and this is a remnant of an earlier patch.

@@ -94,7 +94,7 @@ static std::error_code getHostID(SmallVectorImpl<char> &HostID) {
   StringRef UUIDRef(UUIDStr);
   HostID.append(UUIDRef.begin(), UUIDRef.end());
 
-#elif LLVM_ON_UNIX
+#elif !defined(__wasi__)

Member

You're changing behavior for !LLVM_ON_UNIX (like Windows) to use this branch now. That seems probably unintended. Did you mean #elif LLVM_ON_UNIX && !defined(__wasi__)?

Collaborator Author

Thanks, this is the cause of buildbot failures that I missed during my own review.
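
The difference between the three conditions can be modelled as plain boolean logic. This sketch covers only the `#elif` in question and whatever follows it; the `Platform` struct and function names are hypothetical, and treating WASI as having `LLVM_ON_UNIX` set is an assumption made for the example.

```cpp
#include <cassert>

enum class Branch { HostName, Fallback };

struct Platform {
  bool OnUnix; // models LLVM_ON_UNIX
  bool Wasi;   // models defined(__wasi__)
};

// Condition before the patch: #elif LLVM_ON_UNIX
Branch beforePatch(Platform P) {
  return P.OnUnix ? Branch::HostName : Branch::Fallback;
}

// Condition as posted: #elif !defined(__wasi__)
Branch asPosted(Platform P) {
  return !P.Wasi ? Branch::HostName : Branch::Fallback;
}

// Condition as suggested: #elif LLVM_ON_UNIX && !defined(__wasi__)
Branch asSuggested(Platform P) {
  return (P.OnUnix && !P.Wasi) ? Branch::HostName : Branch::Fallback;
}
```

For a Windows-like platform (`OnUnix` false, `Wasi` false), `asPosted` selects the hostname branch while the other two do not, which matches the behavior change the reviewer describes.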

@@ -508,6 +512,9 @@ static bool is_local_impl(struct STATVFS &Vfs) {
#elif defined(__EMSCRIPTEN__)
// Emscripten doesn't currently support remote filesystem mounts.
return true;
#elif defined(__wasi__)

Member

__EMSCRIPTEN__ || __wasi__ is effectively equal to __wasm__; merge clauses?

Collaborator Author

Does emscripten no longer have asm.js mode? Last time I checked it did, but I'm not sure if it still does today...

Member

It doesn't; C++ compilation only targets wasm.

The emscripten driver still supports generating javascript, but only via invoking the binaryen "wasm2js" tool post-link -- which translates the wasm binary into javascript.

@@ -252,13 +257,21 @@ std::error_code Process::FixupStandardFileDescriptors() {

if (NullFD == StandardFD)
FDC.keepOpen();
#if defined(__wasi__)
else // WASI does not have `dup2` by design. Return EBADF.

Member

Does wasi have /dev/null? If not, this whole function is pointless. If so, that could be opened directly, instead. Actually, given the number of changes here for wasi, shouldn't wasi just have its own llvm/lib/Support/WASI/Process.inc file?

Collaborator Author

No, no /dev/null.

@@ -71,6 +71,7 @@ ProcessInfo::ProcessInfo() : Pid(0), ReturnCode(0) {}
ErrorOr<std::string> sys::findProgramByName(StringRef Name,
ArrayRef<StringRef> Paths) {
assert(!Name.empty() && "Must have a name!");
#if !defined(__wasi__)

Member

Why doesn't this function work in wasi?

Actually, since process spawning doesn't work in WASI at all, it again seems to me that WASI should have its own llvm/lib/Support/WASI/Program.inc, which is just full of stub implementations, instead of ifdef'ing everything out here.

@@ -273,9 +273,19 @@ static bool printMarkupStackTrace(StringRef Argv0, void **StackTrace, int Depth,
}

// Include the platform-specific parts of this class.
#ifdef LLVM_ON_UNIX
#if defined(__wasi__)
// WASI does not have signals.

Member

Move these stubs to WASI/Signals.inc for consistency with the other platform support files.

@@ -30,7 +30,10 @@
#include <cstring>
#include <string>
#include <sys/types.h>

#if !defined(__wasi__)
#include <sys/wait.h>

Member

Is this header even needed on other platforms? I don't see anything clearly 'wait' related in this header.

@@ -19,13 +19,13 @@
 namespace llvm {
 namespace sys {
 Watchdog::Watchdog(unsigned int seconds) {
-#ifdef HAVE_UNISTD_H
+#if HAVE_UNISTD_H && !defined(__wasi__)

Member

Looks like this could again be switched to a WASI/Watchdog.inc, with stub impls.
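
A sketch of what such a stub file might contain, assuming the patch's premise that `alarm()` is not used on WASI. The `demo` namespace and the local class declaration are hypothetical stand-ins for the real `llvm::sys::Watchdog` header so that the example compiles on its own.

```cpp
#include <cassert>

namespace demo {
namespace sys {

// Hypothetical stand-in for the Watchdog declaration from the public header.
class Watchdog {
public:
  explicit Watchdog(unsigned Seconds);
  ~Watchdog();
};

// Possible contents of a WASI/Watchdog.inc: both members are no-ops, so no
// preprocessor conditionals are needed inside the function bodies.
Watchdog::Watchdog(unsigned) {}
Watchdog::~Watchdog() {}

} // namespace sys
} // namespace demo
```

Keeping the stubs in their own file would leave Unix/Watchdog.inc untouched, at the cost of one more platform directory to select in the including source file.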

@@ -41,6 +41,8 @@

using namespace llvm;

#if defined(_WIN32) || !defined(__wasi__)

Member

This conditional is weird; there's no case where __wasi__ is defined at the same time as _WIN32, so you just mean #if !defined(__wasi__).

Also you've removed the entire type's implementation, but not the declaration in the header, which is broken.

Collaborator Author

raw_socket_stream isn't actually used anywhere except for tests, so this doesn't break anything. Should it perhaps be removed instead?

@whitequark
Collaborator Author

@jyknight Thanks for the review! I will split the stubs off.

Regarding threading, there is an unfortunate combination of issues here:

  1. The wasm32-wasip1 target doesn't have e.g. std::mutex available at all. This makes it impossible to build LLVM and the amount of conditional compilation is such that (as far as I understand) it has no chance of being merged.
  2. The wasm32-wasip1-threads target, with -pthread enabled, does have those primitives, but using this target causes the compiled executables to use shared memory, which makes some downstream tools like jco reject such binaries even if they never spawn threads.
  3. In some cases (see LLD above), LLVM compiled without LLVM_ENABLE_THREADS would still statically link to thread spawning primitives, which would make the executables unusable on hosts which do not support threads (even if they support atomics and accept memories marked as shared).

It wasn't clear to me initially what the way forward is, but it seems like it's not really feasible to conditionalize each use of std::mutex and instead WASI sysroots should provide a single-threaded implementation of synchronization primitives for embeddings which do not spawn threads, rather than placing this burden on code targeting Wasm.

@jyknight
Member

> It wasn't clear to me initially what the way forward is, but it seems like it's not really feasible to conditionalize each use of std::mutex and instead WASI sysroots should provide a single-threaded implementation of synchronization primitives for embeddings which do not spawn threads, rather than placing this burden on code targeting Wasm.

Absolutely. This is, perhaps, a "bug" (enhancement request) for libc++: its _LIBCPP_HAS_NO_THREADS mode currently deletes the mutex and related types, but it should instead make them stubs. (It should continue to delete the thread types like it does today.) Alternatively, perhaps it would be good to stub it out at the pthread API level, so that C code is also handled.

So, I think it'd be good to get this PR in with only supporting WASI-with-threads for now, then return to wasi-without-threads after addressing the above.
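
As a rough illustration of the "stub mutex" idea, the class below is a hypothetical single-threaded stand-in; it is not libc++ code and says nothing about how libc++ would implement such a mode. The `Depth` counter exists only so the example can be checked.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical single-threaded stand-in for std::mutex. It provides lock(),
// try_lock(), and unlock(), so std::lock_guard can be instantiated with it.
class NoopMutex {
public:
  void lock() { ++Depth; }
  bool try_lock() {
    ++Depth;
    return true;
  }
  void unlock() { --Depth; }
  int depth() const { return Depth; }

private:
  int Depth = 0; // bookkeeping for the demo only; a real stub could be empty
};

// Increments Counter while holding the lock, as threaded code would.
int incrementLocked(NoopMutex &M, int &Counter) {
  std::lock_guard<NoopMutex> Guard(M);
  return ++Counter;
}
```

Code written against a mutex-shaped interface keeps compiling unchanged, which is the property that avoids conditionalizing every use of `std::mutex`.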

@whitequark
Collaborator Author

> Alternatively, perhaps it would be good to stub it out at the pthread API level, so that C code is also handled.

This is being discussed in wasi-libc and might be a good way forward.

Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category cmake Build system in general and CMake in particular lld:COFF lld:ELF lld:MachO lld:wasm lld llvm:adt llvm:support